{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sepsis Detection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Objective"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sepsis is a clinical syndrome resulting from an infection associated with systemic inflammation, in which pathogenic factors and host characteristics (age, comorbidities, genetics, environment) determine the severity and course of the disease.\n",
"\n",
"In Brazil, according to the latest national report from ILAS (Instituto Latino Americano da Sepse), sepsis mortality is 40% across public and private hospitals. Beyond its impact on mortality, the complexity of the cases means that half of the patients need to be treated in Intensive Care Units (ICUs), accounting for 25% of ICU bed occupancy in Brazil. This makes sepsis one of the costliest diseases for the country's public and private health sectors.\n",
"\n",
"The main problem with an accurate diagnosis of sepsis is that it requires blood and urine samples from the patient. Until the relative amounts of lactate, leukocytes, and C-reactive protein are measured, or the urine culture is done, which are the more precise indicators of sepsis, the patient's clinical condition may deteriorate rapidly.\n",
"\n",
"Excerpts taken from the master's thesis of Aline Junskowski Kalil [1].\n",
"\n",
"[1] Kalil, A. J. (2017). Avaliação do impacto na identificação de pacientes com risco de sepse após implantação de um robô cognitivo gerenciador de risco (Robô Laura) (Master's thesis, Universidade Tecnológica Federal do Paraná)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "1CpTG2MBKlU1"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"np.random.seed(2021)\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "0pqhlkWYLRQ-"
},
"outputs": [],
"source": [
"df_test = pd.read_csv(\"test_data_without_label.csv\")\n",
"df_train = pd.read_csv(\"training_data.csv\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6fEMB5dJL3MD"
},
"source": [
"## Getting to know the data"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xBXyvTrqL7Fc"
},
"source": [
"`id`: ID  \n",
"`num_atend`: encounter number (patient number)  \n",
"`temperatura`: temperature  \n",
"`pulso`: pulse  \n",
"`respiracao`: respiration  \n",
"`pa_min`: minimum pressure  \n",
"`pa_max`: maximum pressure"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "ZFzlc3jtLtxA",
"outputId": "3885e14f-1695-48f9-a334-1f4fa306a2ed"
},
"outputs": [
{
"data": {
"text/plain": [
"   id  num_atend  temperatura  pulso  respiracao  pa_min  pa_max  sepse\n",
"0   1    6066066         36.0  117.0         NaN   113.0    72.0      1\n",
"1   2    6019916         36.0  105.0         NaN     NaN     NaN      1\n",
"2   3    6000000         38.0  118.0         NaN   110.0    70.0      1\n",
"3   4    5993343         37.0  136.0         NaN     NaN     NaN      1\n",
"4   5    6001799         37.0  104.0         NaN     NaN     NaN      1"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "BiBvuZcPPdg-"
},
"outputs": [],
"source": [
"id = df_test.id  # will be used later when generating predictions\n",
"\n",
"df_train.drop(['id', 'num_atend'], axis=1, inplace=True)\n",
"\n",
"df_test.drop(['id', 'num_atend'], axis=1, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 284
},
"id": "h7zNVT9BO_tg",
"outputId": "47b405bd-2304-4d5e-9203-b46fa0c5f3f0"
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Relative frequency of each class, sorted by label so the bars line up with 0 and 1\n",
"y = df_train.sepse.value_counts(normalize=True).sort_index()\n",
"plt.bar(y.index.astype(str), y.values)\n",
"plt.title('Relative frequency of sepsis')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Rq97bUIvMgR3"
},
"source": [
"## Preprocessing"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "3TaUFzkjMhgf",
"outputId": "d409deb9-fd8a-498b-908f-e46cb96af5bb"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Percentage of duplicated rows: 70.31\n"
]
}
],
"source": [
"nLinhas, nColunas = df_train.shape\n",
"\n",
"# Fraction of rows that exactly repeat an earlier row\n",
"dupl = df_train.duplicated().sum() / nLinhas\n",
"\n",
"print('Percentage of duplicated rows:', round(dupl * 100, 2))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 252
},
"id": "f98ldTEfMh3F",
"outputId": "8ad619dd-192d-45e9-b0d7-da77a1580f7f"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Missing values (training set):\n"
]
},
{
"data": {
"text/plain": [
"             Qtd Nan  Qtd Nan %\n",
"temperatura     2145      12.66\n",
"pulso           2736      16.14\n",
"respiracao     13734      81.03\n",
"pa_min          8466      49.95\n",
"pa_max          8468      49.96\n",
"sepse              0       0.00"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Count the NaN values in each attribute\n",
"nan_train = pd.DataFrame()\n",
"nan_train['Qtd Nan'] = df_train.isna().sum()\n",
"nan_train['Qtd Nan %'] = round(100 * df_train.isna().sum() / len(df_train), 2)\n",
"print('Missing values (training set):')\n",
"nan_train.head(6)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 221
},
"id": "I-ZDOJeNMp_9",
"outputId": "8fefacb9-891c-4f50-9c5c-6447cac1bed9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Missing values (test set):\n"
]
},
{
"data": {
"text/plain": [
"             Qtd Nan  Qtd Nan %\n",
"temperatura      903      12.37\n",
"pulso           1183      16.21\n",
"respiracao      6174      84.58\n",
"pa_min          3864      52.93\n",
"pa_max          3865      52.95"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Count the NaN values in each attribute\n",
"nan_test = pd.DataFrame()\n",
"nan_test['Qtd Nan'] = df_test.isna().sum()\n",
"nan_test['Qtd Nan %'] = round(100 * df_test.isna().sum() / len(df_test), 2)\n",
"print('Missing values (test set):')\n",
"nan_test.head(6)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HlhSQ0UUhn4r"
},
"source": [
"Because a large share of the data is missing in both the training and test sets, we decided to use the KNN Imputer to assign values to the missing entries. As its name suggests, the KNN Imputer fills in missing values using the k-nearest-neighbors approach."
]
},
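{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the idea, the example below runs scikit-learn's `KNNImputer` on a small made-up array. The array values and `n_neighbors=2` are illustrative assumptions, not the data or settings used in this notebook.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from sklearn.impute import KNNImputer\n",
"\n",
"# Toy data (hypothetical): columns are temperature, pulse, respiration\n",
"X = np.array([\n",
"    [36.0, 117.0, np.nan],\n",
"    [36.5, 105.0, 18.0],\n",
"    [38.0, 118.0, 20.0],\n",
"    [np.nan, 110.0, 19.0],\n",
"])\n",
"\n",
"# Each missing entry is replaced by the mean of that feature over the\n",
"# n_neighbors closest rows, using a distance that skips missing coordinates\n",
"imputer = KNNImputer(n_neighbors=2)\n",
"X_filled = imputer.fit_transform(X)\n",
"\n",
"print(X_filled)\n",
"```\n",
"\n",
"Since the neighbors are found from distances on the raw values, features with larger numeric ranges (pulse, for example) weigh more in the choice of neighbors; scaling the features beforehand is one common option."
]
},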
{
"cell_type": "markdown",
"metadata": {
"id": "Y2ELbkwwiqcG"
},
"source": [
"Before applying this imputation method, we will inspect and clean the attributes of the dataset."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9fWPknVhUuyM"
},
"source": [
"### Cleaning the temperatura field"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"id": "G_mzXSAv6T8Y"
},
"outputs": [],
"source": [
"treino = df_train.copy()\n",
"teste = df_test.copy()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "7vbEP-ceULWt",
"outputId": "26afac32-3efe-4161-d106-cb8f85f7255b"
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 36. , 38. , 37. , 35. , 32. , nan, 39. , 35.8,\n",
" 0. , 34. , 36.7, 35.6, 356. , 40. , 131. , 36.8,\n",
" 35.1, 36.1, 35.4, 33.6, 37.6, 36.6, 35.5, 36.5,\n",
" 3602. , 37.2, 6. , 378. , 36.9, -35. , 36.2, 36.3,\n",
" 37.7, 33. , 85. , 336. , 368. , 37.5, 35.3, 35.7,\n",
" 35.9, 36.4])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.temperatura.unique()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "nudz60jfAYcs",
"outputId": "fb58e7ae-51bb-4676-9f1c-7c9c8729dd2c"
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 36. , 37. , nan, 39. , 36.6, 38. , 36.7, 35. , 36.3,\n",
" 37.2, 34. , 33. , 0. , 37.5, 35.8, 37.7, 36.8, 35.3,\n",
" 36.1, 35.4, 35.5, 36.5, 37.6, 36.2, 35.1, 35.6, 36.9,\n",
" 378. , 36.4, -35. , 40. , 35.7, 35.9, 33.6])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"teste.temperatura.unique()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Kd1cuqCbUzed"
},
"source": [
"Some values, such as 3602 and 378, appear to be errors made when the data were entered into the spreadsheet, so we will replace them with NaN."
]
},
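{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of this step on a made-up Series: the readings listed in `bad_values` (a hypothetical name) are replaced with NaN using `Series.mask`, and every other value is left untouched. The sample values are chosen only for illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical sample of temperature readings with two data-entry errors\n",
"temperatura = pd.Series([36.0, 3602.0, 37.2, 378.0, 35.8])\n",
"\n",
"# Values assumed to be typing errors (illustrative list)\n",
"bad_values = [3602, 378]\n",
"\n",
"# mask() puts NaN wherever the condition is True and keeps the rest\n",
"cleaned = temperatura.mask(temperatura.isin(bad_values))\n",
"\n",
"print(cleaned.tolist())\n",
"```\n",
"\n",
"Listing the bad values explicitly keeps the correction auditable, at the cost of having to inspect the unique values of each column first."
]
},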
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"id": "X7vpm4C0VFVa"
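},
"outputs": [],
"source": [
"inf = 9e999  # float overflow yields infinity; inf - inf evaluates to NaN\n",
"# Values such as 3602 and 378 look like data-entry errors, so mark them as missing\n",
"treino['temperatura'] = treino['temperatura'].replace([3602, 378, 336, 368, 356], np.nan)\n",
"teste['temperatura'] = teste['temperatura'].replace(378, np.nan)"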
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"id": "vFFmKhjS4fNu"
},
"outputs": [],
"source": [
"# Take the absolute value so that negative readings such as -35 become positive\n",
"treino['temperatura'] = treino['temperatura'].abs()\n",
"teste['temperatura'] = teste['temperatura'].abs()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vwBNNFpb5LAi"
},
"source": [
"Replace temperatures equal to zero with NaN"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "h3-Si2HV4ml-",
"outputId": "a15461c1-0811-4cf3-c67e-14d99dde9178"
},
"outputs": [
{
"data": {
"text/plain": [
"       temperatura  pulso  respiracao  pa_min  pa_max  sepse\n",
"151            0.0  121.0         NaN   180.0    90.0      1\n",
"1155           0.0   77.0         NaN   129.0    85.0      0\n",
"2115           0.0   84.0         NaN   109.0    81.0      0\n",
"2121           0.0   98.0         NaN   145.0    91.0      0\n",
"2328           0.0   94.0         NaN   130.0    81.0      0\n",
"...            ...    ...         ...     ...     ...    ...\n",
"16268          0.0   87.0        18.0    90.0    57.0      0\n",
"16278          0.0   87.0        18.0    90.0    57.0      0\n",
"16462          0.0   87.0        18.0    90.0    57.0      0\n",
"16513          0.0   87.0        18.0    90.0    57.0      0\n",
"16752          0.0   87.0        18.0    90.0    57.0      0\n",
"\n",
"[194 rows x 6 columns]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('temperatura == 0')"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"id": "HJNGH7b35hFP"
},
"outputs": [],
"source": [
"treino['temperatura'] = treino['temperatura'].replace(0, np.nan)\n",
"teste['temperatura'] = teste['temperatura'].replace(0, np.nan)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3VD1YuWV50wz"
},
"source": [
"Replace temperatures below 20 or greater than or equal to 45 with NaN"
]
},
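{
"cell_type": "markdown",
"metadata": {},
"source": [
"The rule above can be sketched on a made-up Series: `Series.where` keeps the readings inside the interval [20, 45) and replaces the others with NaN. The sample values and the names `in_range` and `cleaned` are illustrative assumptions.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical temperature readings, some outside the accepted range\n",
"temperatura = pd.Series([36.5, 6.0, 131.0, 45.0, 20.0, 39.9])\n",
"\n",
"# Keep readings with 20 <= value < 45; everything else becomes NaN\n",
"in_range = (temperatura >= 20) & (temperatura < 45)\n",
"cleaned = temperatura.where(in_range)\n",
"\n",
"print(cleaned.tolist())\n",
"```\n",
"\n",
"Because comparisons with NaN evaluate to False, readings that were already missing stay missing under this rule."
]
},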
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "vqywDzh45rb_",
"outputId": "cc182a13-9668-4847-f8d4-04606cbe1966"
},
"outputs": [
{
"data": {
"text/plain": [
"       temperatura  pulso  respiracao  pa_min  pa_max  sepse\n",
"577          131.0  106.0         NaN     NaN     NaN      1\n",
"1843           6.0  101.0         NaN   158.0    89.0      0\n",
"4451           6.0   87.0         NaN   131.0    80.0      0\n",
"7123          85.0   85.0         NaN     NaN     NaN      0\n",
"11772         85.0   85.0         NaN     NaN     NaN      0"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('temperatura < 20 or temperatura >= 45')"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"id": "J8_mGsl056jH"
},
"outputs": [],
"source": [
"# Mark temperatures outside the interval [20, 45) as missing\n",
"treino.loc[(treino['temperatura'] < 20) | (treino['temperatura'] >= 45), 'temperatura'] = np.nan\n",
"\n",
"teste.loc[(teste['temperatura'] < 20) | (teste['temperatura'] >= 45), 'temperatura'] = np.nan"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "JaHDmeRqcIBW",
"outputId": "c5a8b8c2-b031-4678-b1e7-a617a68077bf"
},
"outputs": [
{
"data": {
"text/plain": [
"array([36. , 38. , 37. , 35. , 32. , nan, 39. , 35.8, 34. , 36.7, 35.6,\n",
" 40. , 36.8, 35.1, 36.1, 35.4, 33.6, 37.6, 36.6, 35.5, 36.5, 37.2,\n",
" 36.9, 36.2, 36.3, 37.7, 33. , 37.5, 35.3, 35.7, 35.9, 36.4])"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.temperatura.unique()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "enrTvyvgBBMQ",
"outputId": "32518b21-0d2c-4efe-ffe5-c287fd298a73"
},
"outputs": [
{
"data": {
"text/plain": [
"array([36. , 37. , nan, 39. , 36.6, 38. , 36.7, 35. , 36.3, 37.2, 34. ,\n",
" 33. , 37.5, 35.8, 37.7, 36.8, 35.3, 36.1, 35.4, 35.5, 36.5, 37.6,\n",
" 36.2, 35.1, 35.6, 36.9, 36.4, 40. , 35.7, 35.9, 33.6])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"teste.temperatura.unique()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "kEUZyuL2c17z",
"outputId": "fea4112c-f4f3-4c45-d7c0-882388b989b4"
},
"outputs": [
{
"data": {
"text/plain": [
"count    14596.000000\n",
"mean        36.187188\n",
"std          0.922318\n",
"min         32.000000\n",
"25%         36.000000\n",
"50%         36.000000\n",
"75%         36.700000\n",
"max         40.000000\n",
"Name: temperatura, dtype: float64"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.temperatura.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dLhbB8KT9M8c"
},
"source": [
"### Cleaning the pulso field"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "JVIyNBjI9Mex",
"outputId": "f3bfac25-d898-452e-8c68-37a50840469e"
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 117., 105., 118., 136., 104., 83., 91., 99., 67.,\n",
" 116., 107., 72., 85., 68., 102., 110., 66., 139.,\n",
" 74., 101., 120., 71., 144., nan, 96., 108., 92.,\n",
" 98., 94., 100., 114., 78., 87., 97., 106., 111.,\n",
" 115., 112., 64., 90., 103., 76., 124., 80., 119.,\n",
" 140., 75., 137., 82., 125., 134., 109., 122., 79.,\n",
" 95., 121., 130., 65., 69., 113., 73., 77., 131.,\n",
" 88., 81., 93., 126., 84., 56., 128., 170., 60.,\n",
" 38., 154., 123., 138., 155., 86., 127., 70., 52.,\n",
" 89., 132., 152., 62., 135., 61., 149., 179., 150.,\n",
" 183., 51., 165., 63., 143., 44., 129., 151., 156.,\n",
" 158., 55., 59., 153., 174., 186., 42., 40., 145.,\n",
" 32., 133., 43., 142., 54., 160., 53., 58., 57.,\n",
" 1000., 50., 46., 168., 30., 47., 49., 175., 11.,\n",
" 10., 166., 147., 0., 41., 163., 48., 177., 192.,\n",
" 157., 841., 148., 180., 200., 36., 159., 162.])"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.pulso.unique()  # heart rate"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "IeoyGVC3BeBd",
"outputId": "dfa70521-35eb-4cfc-85c6-99f69bdc75a3"
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 93., 107., 90., 69., 71., nan, 104., 103., 99.,\n",
" 75., 82., 73., 84., 51., 98., 77., 76., 66.,\n",
" 80., 110., 97., 64., 100., 138., 81., 91., 74.,\n",
" 89., 96., 87., 135., 56., 95., 102., 88., 111.,\n",
" 101., 86., 68., 78., 83., 94., 128., 85., 62.,\n",
" 122., 119., 126., 42., 72., 79., 61., 92., 70.,\n",
" 112., 118., 120., 57., 54., 140., 105., 113., 114.,\n",
" 106., 131., 67., 55., 115., 63., 65., 125., 121.,\n",
" 155., 144., 130., 123., 109., 10., 142., 116., 169.,\n",
" 129., 136., 108., 117., 60., 127., 133., 44., 59.,\n",
" 137., 52., 124., 180., 139., 35., 50., 58., 38.,\n",
" 177., 156., 145., 47., 168., 132., 1116., 163., 43.,\n",
" 53., 46., 49., 178., 165., 134., 149., 704., 0.,\n",
" 11., 150., 170., 148., 151., 166., 147., 143., 160.,\n",
" 159., 158., 174., 153., 157., 162., 179., 183., 152.,\n",
" 23.])"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"teste.pulso.unique()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"id": "1uYmTtHs97sH",
"outputId": "25e25c26-3e37-478e-eda5-e89e240d7788"
},
"outputs": [
{
"data": {
"text/plain": [
"       temperatura   pulso  respiracao  pa_min  pa_max  sepse\n",
"1616          36.0  1000.0         NaN   120.0    66.0      0\n",
"6326          36.0   841.0         NaN   128.0    74.0      0\n",
"8027          36.0   200.0         NaN     NaN     NaN      0\n",
"14473         36.0   841.0         NaN   128.0    74.0      0"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pulso >= 200')  # values of 200 or more will be treated as NaN"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"id": "jlNzj7Wx_1tc"
},
"outputs": [],
"source": [
"# Mark pulse values of 200 or more as missing\n",
"treino.loc[treino['pulso'] >= 200, 'pulso'] = np.nan\n",
"teste.loc[teste['pulso'] >= 200, 'pulso'] = np.nan"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 483
},
"id": "A8VOhSK2AThE",
"outputId": "823811db-5d64-49dc-c3fd-8368c552e6b1"
},
"outputs": [
{
"data": {
"text/plain": [
"       temperatura  pulso  respiracao  pa_min  pa_max  sepse\n",
"4172          36.0    0.0         NaN     NaN     NaN      0\n",
"5523           NaN    0.0         NaN     NaN     NaN      0\n",
"5912          35.0    0.0         NaN     NaN     NaN      0\n",
"6087          35.0    0.0         NaN     NaN     NaN      0\n",
"6952          37.0    0.0         NaN     NaN     NaN      0\n",
"7058          34.0    0.0         NaN     NaN     NaN      0\n",
"7910          37.0    0.0         NaN     NaN     NaN      0\n",
"10652         38.0    0.0         NaN     NaN     NaN      0\n",
"11120         36.0    0.0         NaN     NaN     NaN      0\n",
"11156          NaN    0.0         NaN    90.0    48.0      0\n",
"11942         34.0    0.0         NaN     NaN     NaN      0\n",
"13009          NaN    0.0         NaN     NaN     NaN      0\n",
"15104          NaN    0.0         NaN     NaN     NaN      0\n",
"15441         37.0    0.0         NaN     NaN     NaN      0"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pulso == 0')  # a pulse of 0 will be treated as missing (NaN)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"id": "ta0DhOlj-2Ba"
},
"outputs": [],
"source": [
"treino['pulso'] = treino['pulso'].replace(0, np.nan)\n",
"teste['pulso'] = teste['pulso'].replace(0, np.nan)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "POka3YI6CIXC",
"outputId": "b8fca6ad-2725-4eb5-c7dc-07f14c520b82"
},
"outputs": [
{
"data": {
"text/plain": [
"array([117., 105., 118., 136., 104., 83., 91., 99., 67., 116., 107.,\n",
" 72., 85., 68., 102., 110., 66., 139., 74., 101., 120., 71.,\n",
" 144., nan, 96., 108., 92., 98., 94., 100., 114., 78., 87.,\n",
" 97., 106., 111., 115., 112., 64., 90., 103., 76., 124., 80.,\n",
" 119., 140., 75., 137., 82., 125., 134., 109., 122., 79., 95.,\n",
" 121., 130., 65., 69., 113., 73., 77., 131., 88., 81., 93.,\n",
" 126., 84., 56., 128., 170., 60., 38., 154., 123., 138., 155.,\n",
" 86., 127., 70., 52., 89., 132., 152., 62., 135., 61., 149.,\n",
" 179., 150., 183., 51., 165., 63., 143., 44., 129., 151., 156.,\n",
" 158., 55., 59., 153., 174., 186., 42., 40., 145., 32., 133.,\n",
" 43., 142., 54., 160., 53., 58., 57., 50., 46., 168., 30.,\n",
" 47., 49., 175., 11., 10., 166., 147., 41., 163., 48., 177.,\n",
" 192., 157., 148., 180., 36., 159., 162.])"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.pulso.unique()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "1FLgXNF5qN32",
"outputId": "f806f9fd-6d31-45ad-bfda-47925e700a62"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" 36.0 \n",
" 117.0 \n",
" NaN \n",
" 113.0 \n",
" 72.0 \n",
" 1 \n",
" \n",
" \n",
" 2 \n",
" 38.0 \n",
" 118.0 \n",
" NaN \n",
" 110.0 \n",
" 70.0 \n",
" 1 \n",
" \n",
" \n",
" 3 \n",
" 37.0 \n",
" 136.0 \n",
" NaN \n",
" NaN \n",
" NaN \n",
" 1 \n",
" \n",
" \n",
" 9 \n",
" 37.0 \n",
" 116.0 \n",
" NaN \n",
" 111.0 \n",
" 70.0 \n",
" 1 \n",
" \n",
" \n",
" 15 \n",
" 37.0 \n",
" 110.0 \n",
" NaN \n",
" NaN \n",
" NaN \n",
" 1 \n",
" \n",
" \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" \n",
" \n",
" 10317 \n",
" 36.0 \n",
" 116.0 \n",
" NaN \n",
" 123.0 \n",
" 77.0 \n",
" 1 \n",
" \n",
" \n",
" 10322 \n",
" NaN \n",
" 131.0 \n",
" 39.0 \n",
" 82.0 \n",
" 37.0 \n",
" 1 \n",
" \n",
" \n",
" 10324 \n",
" 36.8 \n",
" 121.0 \n",
" 12.0 \n",
" 138.0 \n",
" 89.0 \n",
" 1 \n",
" \n",
" \n",
" 10325 \n",
" 36.8 \n",
" 110.0 \n",
" 19.0 \n",
" 141.0 \n",
" 61.0 \n",
" 1 \n",
" \n",
" \n",
" 10336 \n",
" 36.6 \n",
" 138.0 \n",
" 20.0 \n",
" 119.0 \n",
" 75.0 \n",
" 1 \n",
" \n",
" \n",
"
\n",
"
742 rows × 6 columns
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"0 36.0 117.0 NaN 113.0 72.0 1\n",
"2 38.0 118.0 NaN 110.0 70.0 1\n",
"3 37.0 136.0 NaN NaN NaN 1\n",
"9 37.0 116.0 NaN 111.0 70.0 1\n",
"15 37.0 110.0 NaN NaN NaN 1\n",
"... ... ... ... ... ... ...\n",
"10317 36.0 116.0 NaN 123.0 77.0 1\n",
"10322 NaN 131.0 39.0 82.0 37.0 1\n",
"10324 36.8 121.0 12.0 138.0 89.0 1\n",
"10325 36.8 110.0 19.0 141.0 61.0 1\n",
"10336 36.6 138.0 20.0 119.0 75.0 1\n",
"\n",
"[742 rows x 6 columns]"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pulso >= 110 and sepse == 1') "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hLrx608GKqij"
},
"source": [
"### Tratando o campo respiração"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "l1YryOdmL7-p"
},
"source": [
""
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "01Fzf3eGKqK0",
"outputId": "95991511-e470-4ee2-f6e1-1f2b068ad5d2"
},
"outputs": [
{
"data": {
"text/plain": [
"array([nan, 15., 33., 17., 22., 19., 16., 12., 23., 26., 27., 28., 10.,\n",
" 39., 20., 21., 24., 14., 11., 40., 25., 18., 13., 0., 29.])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.respiracao.unique()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 111
},
"id": "zhc5fyRwNEFa",
"outputId": "999e7055-32df-43da-8c9a-8fc4d31bc1a1"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" 5308 \n",
" 36.7 \n",
" 103.0 \n",
" 0.0 \n",
" 121.0 \n",
" 70.0 \n",
" 0 \n",
" \n",
" \n",
" 13783 \n",
" 36.7 \n",
" 103.0 \n",
" 0.0 \n",
" 121.0 \n",
" 70.0 \n",
" 0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"5308 36.7 103.0 0.0 121.0 70.0 0\n",
"13783 36.7 103.0 0.0 121.0 70.0 0"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('respiracao == 0') #zero igual a nan"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"id": "AAKbroooNQea"
},
"outputs": [],
"source": [
"treino.respiracao.replace(0, np.nan, inplace=True)\n",
"teste.respiracao.replace(0, np.nan, inplace=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AGWvPyxDMZ5U"
},
"source": [
"### Tratando o campo pressão minima e máxima"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "13DX24wWn6PT"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NveAD2IoGmW_"
},
"source": [
"Através dos valores, entendemos que pressão mínima da base é a sistólica e pressão máxima, Diastólica."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LmwrVmmJMpV1",
"outputId": "88660508-384b-499d-bd93-8ddcb1af72e3"
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 113., nan, 110., 153., 111., 126., 107., 185., 127.,\n",
" 133., 97., 191., 130., 170., 143., 147., 120., 90.,\n",
" 122., 138., 152., 137., 142., 119., 125., 115., 204.,\n",
" 150., 128., 117., 118., 160., 100., 155., 149., 114.,\n",
" 159., 123., 108., 102., 145., 180., 136., 140., 169.,\n",
" 178., 144., 103., 105., 109., 88., 134., 101., 164.,\n",
" 167., 89., 135., 132., 148., 139., 131., 94., 200.,\n",
" 182., 86., 81., 146., 186., 121., 106., 93., 112.,\n",
" 189., 154., 187., 163., 129., 124., 141., 116., 10.,\n",
" 98., 156., 166., 151., 175., 92., 79., 104., 212.,\n",
" 85., 220., 91., 209., 95., 165., 157., 161., 171.,\n",
" 158., 82., 74., 168., 173., 96., 179., 162., 172.,\n",
" 99., 75., 219., 184., 190., 197., 230., 177., 202.,\n",
" 174., 15., 224., 208., 236., 40., 198., 199., 203.,\n",
" 12., 215., 201., 11., 217., 192., 181., 210., 195.,\n",
" 183., 188., 19., 176., 225., 233., 193., 16., 231.,\n",
" 80., 14., 213., 211., 18., 194., 69., 222., 83.,\n",
" 73., 67., 242., 245., 0., 17., 1460., 20.])"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.pa_min.unique()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 111
},
"id": "c_G2SaouP71e",
"outputId": "a270b354-2534-4adc-f894-5acb846ef270"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" 11982 \n",
" NaN \n",
" 63.0 \n",
" 15.0 \n",
" 0.0 \n",
" 0.0 \n",
" 0 \n",
" \n",
" \n",
" 13753 \n",
" NaN \n",
" 63.0 \n",
" 15.0 \n",
" 0.0 \n",
" 0.0 \n",
" 0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"11982 NaN 63.0 15.0 0.0 0.0 0\n",
"13753 NaN 63.0 15.0 0.0 0.0 0"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pa_min == 0') #zero igual a nan"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"id": "e4RNXEYrQHZI"
},
"outputs": [],
"source": [
"treino.pa_min.replace(0, np.nan, inplace=True)\n",
"teste.pa_min.replace(0, np.nan, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 607
},
"id": "Adr1WAHuQPog",
"outputId": "e384a97b-5faf-425a-d06f-c7c87b096093"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" 1566 \n",
" 36.0 \n",
" 100.0 \n",
" NaN \n",
" 230.0 \n",
" 120.0 \n",
" 0 \n",
" \n",
" \n",
" 1765 \n",
" 35.0 \n",
" 85.0 \n",
" NaN \n",
" 224.0 \n",
" 101.0 \n",
" 0 \n",
" \n",
" \n",
" 2452 \n",
" NaN \n",
" 175.0 \n",
" NaN \n",
" 236.0 \n",
" 116.0 \n",
" 0 \n",
" \n",
" \n",
" 3083 \n",
" 36.0 \n",
" 75.0 \n",
" NaN \n",
" 230.0 \n",
" 100.0 \n",
" 0 \n",
" \n",
" \n",
" 3153 \n",
" 36.0 \n",
" 84.0 \n",
" NaN \n",
" 230.0 \n",
" 112.0 \n",
" 0 \n",
" \n",
" \n",
" 4751 \n",
" 35.0 \n",
" 68.0 \n",
" NaN \n",
" 225.0 \n",
" 97.0 \n",
" 0 \n",
" \n",
" \n",
" 4987 \n",
" 36.0 \n",
" 83.0 \n",
" NaN \n",
" 233.0 \n",
" 105.0 \n",
" 0 \n",
" \n",
" \n",
" 5273 \n",
" 36.0 \n",
" 80.0 \n",
" NaN \n",
" 231.0 \n",
" 140.0 \n",
" 0 \n",
" \n",
" \n",
" 7658 \n",
" 35.0 \n",
" 90.0 \n",
" NaN \n",
" 222.0 \n",
" 118.0 \n",
" 0 \n",
" \n",
" \n",
" 8749 \n",
" NaN \n",
" 85.0 \n",
" NaN \n",
" 242.0 \n",
" 122.0 \n",
" 0 \n",
" \n",
" \n",
" 10457 \n",
" 35.0 \n",
" 88.0 \n",
" NaN \n",
" 245.0 \n",
" 100.0 \n",
" 0 \n",
" \n",
" \n",
" 11334 \n",
" 36.0 \n",
" 83.0 \n",
" NaN \n",
" 233.0 \n",
" 105.0 \n",
" 0 \n",
" \n",
" \n",
" 11792 \n",
" 35.0 \n",
" 88.0 \n",
" NaN \n",
" 245.0 \n",
" 100.0 \n",
" 0 \n",
" \n",
" \n",
" 14329 \n",
" 36.0 \n",
" 100.0 \n",
" NaN \n",
" 230.0 \n",
" 120.0 \n",
" 0 \n",
" \n",
" \n",
" 14534 \n",
" 36.0 \n",
" 90.0 \n",
" NaN \n",
" 1460.0 \n",
" 100.0 \n",
" 0 \n",
" \n",
" \n",
" 15247 \n",
" 35.0 \n",
" 88.0 \n",
" NaN \n",
" 245.0 \n",
" 100.0 \n",
" 0 \n",
" \n",
" \n",
" 16735 \n",
" 36.0 \n",
" 84.0 \n",
" NaN \n",
" 230.0 \n",
" 112.0 \n",
" 0 \n",
" \n",
" \n",
" 16838 \n",
" 35.0 \n",
" 88.0 \n",
" NaN \n",
" 245.0 \n",
" 100.0 \n",
" 0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"1566 36.0 100.0 NaN 230.0 120.0 0\n",
"1765 35.0 85.0 NaN 224.0 101.0 0\n",
"2452 NaN 175.0 NaN 236.0 116.0 0\n",
"3083 36.0 75.0 NaN 230.0 100.0 0\n",
"3153 36.0 84.0 NaN 230.0 112.0 0\n",
"4751 35.0 68.0 NaN 225.0 97.0 0\n",
"4987 36.0 83.0 NaN 233.0 105.0 0\n",
"5273 36.0 80.0 NaN 231.0 140.0 0\n",
"7658 35.0 90.0 NaN 222.0 118.0 0\n",
"8749 NaN 85.0 NaN 242.0 122.0 0\n",
"10457 35.0 88.0 NaN 245.0 100.0 0\n",
"11334 36.0 83.0 NaN 233.0 105.0 0\n",
"11792 35.0 88.0 NaN 245.0 100.0 0\n",
"14329 36.0 100.0 NaN 230.0 120.0 0\n",
"14534 36.0 90.0 NaN 1460.0 100.0 0\n",
"15247 35.0 88.0 NaN 245.0 100.0 0\n",
"16735 36.0 84.0 NaN 230.0 112.0 0\n",
"16838 35.0 88.0 NaN 245.0 100.0 0"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pa_min > 220') #recebe nan"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"id": "p54aMfdHRPBk"
},
"outputs": [],
"source": [
"treino['pa_min'].values[treino['pa_min'].values > 220] = inf-inf\n",
"teste['pa_min'].values[teste['pa_min'].values > 220] = inf-inf"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "6VwqbRv1RbKw",
"outputId": "5e29532e-81b4-4b11-e28d-ad8a3a2167c8"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" 442 \n",
" 37.0 \n",
" 125.0 \n",
" NaN \n",
" 10.0 \n",
" 75.0 \n",
" 1 \n",
" \n",
" \n",
" 519 \n",
" 36.0 \n",
" 106.0 \n",
" NaN \n",
" 79.0 \n",
" 52.0 \n",
" 1 \n",
" \n",
" \n",
" 884 \n",
" 37.0 \n",
" 117.0 \n",
" NaN \n",
" 74.0 \n",
" 56.0 \n",
" 1 \n",
" \n",
" \n",
" 1118 \n",
" NaN \n",
" 51.0 \n",
" NaN \n",
" 75.0 \n",
" 52.0 \n",
" 0 \n",
" \n",
" \n",
" 1706 \n",
" 36.0 \n",
" 71.0 \n",
" NaN \n",
" 15.0 \n",
" 89.0 \n",
" 0 \n",
" \n",
" \n",
" 2474 \n",
" 35.0 \n",
" 74.0 \n",
" NaN \n",
" 40.0 \n",
" 69.0 \n",
" 0 \n",
" \n",
" \n",
" 2739 \n",
" 36.0 \n",
" 95.0 \n",
" NaN \n",
" 12.0 \n",
" 69.0 \n",
" 0 \n",
" \n",
" \n",
" 3108 \n",
" 38.0 \n",
" 85.0 \n",
" NaN \n",
" 11.0 \n",
" 75.0 \n",
" 0 \n",
" \n",
" \n",
" 3270 \n",
" 35.0 \n",
" 84.0 \n",
" NaN \n",
" 11.0 \n",
" 79.0 \n",
" 0 \n",
" \n",
" \n",
" 3292 \n",
" 35.0 \n",
" 92.0 \n",
" NaN \n",
" 15.0 \n",
" 84.0 \n",
" 0 \n",
" \n",
" \n",
" 4278 \n",
" 37.0 \n",
" 117.0 \n",
" NaN \n",
" 74.0 \n",
" 56.0 \n",
" 0 \n",
" \n",
" \n",
" 4496 \n",
" 36.0 \n",
" 84.0 \n",
" NaN \n",
" 19.0 \n",
" 82.0 \n",
" 0 \n",
" \n",
" \n",
" 4863 \n",
" 36.0 \n",
" 100.0 \n",
" NaN \n",
" 12.0 \n",
" 80.0 \n",
" 0 \n",
" \n",
" \n",
" 4961 \n",
" 36.0 \n",
" 67.0 \n",
" NaN \n",
" 10.0 \n",
" 71.0 \n",
" 0 \n",
" \n",
" \n",
" 5106 \n",
" 36.0 \n",
" 62.0 \n",
" NaN \n",
" 16.0 \n",
" 90.0 \n",
" 0 \n",
" \n",
" \n",
" 5522 \n",
" 36.0 \n",
" 81.0 \n",
" NaN \n",
" 14.0 \n",
" 10.0 \n",
" 0 \n",
" \n",
" \n",
" 5636 \n",
" 35.0 \n",
" 90.0 \n",
" NaN \n",
" 11.0 \n",
" 75.0 \n",
" 0 \n",
" \n",
" \n",
" 5774 \n",
" 36.0 \n",
" 94.0 \n",
" NaN \n",
" 10.0 \n",
" 60.0 \n",
" 0 \n",
" \n",
" \n",
" 6360 \n",
" 36.0 \n",
" 100.0 \n",
" NaN \n",
" 18.0 \n",
" 93.0 \n",
" 0 \n",
" \n",
" \n",
" 7089 \n",
" 35.0 \n",
" 89.0 \n",
" NaN \n",
" 69.0 \n",
" 60.0 \n",
" 0 \n",
" \n",
" \n",
" 7401 \n",
" 36.0 \n",
" 62.0 \n",
" NaN \n",
" 18.0 \n",
" 74.0 \n",
" 0 \n",
" \n",
" \n",
" 7777 \n",
" 33.0 \n",
" 53.0 \n",
" NaN \n",
" 73.0 \n",
" 43.0 \n",
" 0 \n",
" \n",
" \n",
" 7843 \n",
" 35.0 \n",
" 180.0 \n",
" NaN \n",
" 67.0 \n",
" 43.0 \n",
" 0 \n",
" \n",
" \n",
" 8424 \n",
" 36.0 \n",
" 100.0 \n",
" NaN \n",
" 16.0 \n",
" 10.0 \n",
" 0 \n",
" \n",
" \n",
" 9777 \n",
" 37.0 \n",
" 125.0 \n",
" NaN \n",
" 10.0 \n",
" 75.0 \n",
" 1 \n",
" \n",
" \n",
" 10069 \n",
" 36.0 \n",
" 106.0 \n",
" NaN \n",
" 79.0 \n",
" 52.0 \n",
" 1 \n",
" \n",
" \n",
" 12320 \n",
" 38.0 \n",
" 85.0 \n",
" NaN \n",
" 11.0 \n",
" 75.0 \n",
" 0 \n",
" \n",
" \n",
" 13078 \n",
" 36.0 \n",
" 67.0 \n",
" NaN \n",
" 10.0 \n",
" 71.0 \n",
" 0 \n",
" \n",
" \n",
" 13548 \n",
" 33.0 \n",
" 53.0 \n",
" NaN \n",
" 73.0 \n",
" 43.0 \n",
" 0 \n",
" \n",
" \n",
" 13834 \n",
" 36.0 \n",
" 85.0 \n",
" NaN \n",
" 17.0 \n",
" 92.0 \n",
" 0 \n",
" \n",
" \n",
" 14568 \n",
" 36.0 \n",
" 62.0 \n",
" NaN \n",
" 18.0 \n",
" 74.0 \n",
" 0 \n",
" \n",
" \n",
" 15837 \n",
" 36.0 \n",
" 71.0 \n",
" NaN \n",
" 15.0 \n",
" 89.0 \n",
" 0 \n",
" \n",
" \n",
" 15936 \n",
" 36.0 \n",
" 68.0 \n",
" NaN \n",
" 20.0 \n",
" 10.0 \n",
" 0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"442 37.0 125.0 NaN 10.0 75.0 1\n",
"519 36.0 106.0 NaN 79.0 52.0 1\n",
"884 37.0 117.0 NaN 74.0 56.0 1\n",
"1118 NaN 51.0 NaN 75.0 52.0 0\n",
"1706 36.0 71.0 NaN 15.0 89.0 0\n",
"2474 35.0 74.0 NaN 40.0 69.0 0\n",
"2739 36.0 95.0 NaN 12.0 69.0 0\n",
"3108 38.0 85.0 NaN 11.0 75.0 0\n",
"3270 35.0 84.0 NaN 11.0 79.0 0\n",
"3292 35.0 92.0 NaN 15.0 84.0 0\n",
"4278 37.0 117.0 NaN 74.0 56.0 0\n",
"4496 36.0 84.0 NaN 19.0 82.0 0\n",
"4863 36.0 100.0 NaN 12.0 80.0 0\n",
"4961 36.0 67.0 NaN 10.0 71.0 0\n",
"5106 36.0 62.0 NaN 16.0 90.0 0\n",
"5522 36.0 81.0 NaN 14.0 10.0 0\n",
"5636 35.0 90.0 NaN 11.0 75.0 0\n",
"5774 36.0 94.0 NaN 10.0 60.0 0\n",
"6360 36.0 100.0 NaN 18.0 93.0 0\n",
"7089 35.0 89.0 NaN 69.0 60.0 0\n",
"7401 36.0 62.0 NaN 18.0 74.0 0\n",
"7777 33.0 53.0 NaN 73.0 43.0 0\n",
"7843 35.0 180.0 NaN 67.0 43.0 0\n",
"8424 36.0 100.0 NaN 16.0 10.0 0\n",
"9777 37.0 125.0 NaN 10.0 75.0 1\n",
"10069 36.0 106.0 NaN 79.0 52.0 1\n",
"12320 38.0 85.0 NaN 11.0 75.0 0\n",
"13078 36.0 67.0 NaN 10.0 71.0 0\n",
"13548 33.0 53.0 NaN 73.0 43.0 0\n",
"13834 36.0 85.0 NaN 17.0 92.0 0\n",
"14568 36.0 62.0 NaN 18.0 74.0 0\n",
"15837 36.0 71.0 NaN 15.0 89.0 0\n",
"15936 36.0 68.0 NaN 20.0 10.0 0"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pa_min < 80')"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "86vAYmDJMng6",
"outputId": "69f62307-5e64-4658-8e66-98fe3e239867"
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 72., nan, 70., 118., 56., 76., 64., 75., 51., 92., 107.,\n",
" 87., 112., 99., 114., 59., 46., 67., 83., 82., 84., 96.,\n",
" 78., 69., 66., 123., 68., 79., 60., 102., 80., 54., 90.,\n",
" 73., 74., 65., 77., 91., 111., 62., 93., 71., 110., 63.,\n",
" 53., 58., 61., 81., 85., 48., 89., 41., 57., 100., 55.,\n",
" 86., 50., 88., 47., 52., 97., 140., 101., 126., 94., 108.,\n",
" 109., 124., 95., 106., 113., 115., 37., 45., 116., 98., 11.,\n",
" 103., 117., 120., 119., 104., 20., 49., 130., 131., 105., 10.,\n",
" 8., 121., 133., 150., 135., 43., 0., 969., 122., 44., 883.,\n",
" 704.])"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.pa_max.unique()"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"id": "0MOC4FutSH4d",
"outputId": "e458f0e2-d457-4333-ca5d-9c1fb2c6c8cf"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" 7847 \n",
" 36.0 \n",
" 94.0 \n",
" NaN \n",
" 121.0 \n",
" 0.0 \n",
" 0 \n",
" \n",
" \n",
" 10827 \n",
" 36.0 \n",
" 94.0 \n",
" NaN \n",
" 121.0 \n",
" 0.0 \n",
" 0 \n",
" \n",
" \n",
" 11982 \n",
" NaN \n",
" 63.0 \n",
" 15.0 \n",
" NaN \n",
" 0.0 \n",
" 0 \n",
" \n",
" \n",
" 13753 \n",
" NaN \n",
" 63.0 \n",
" 15.0 \n",
" NaN \n",
" 0.0 \n",
" 0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"7847 36.0 94.0 NaN 121.0 0.0 0\n",
"10827 36.0 94.0 NaN 121.0 0.0 0\n",
"11982 NaN 63.0 15.0 NaN 0.0 0\n",
"13753 NaN 63.0 15.0 NaN 0.0 0"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pa_max == 0') #zero igual a nan"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"id": "3GFDKbI7SVi9"
},
"outputs": [],
"source": [
"treino.pa_max.replace(0, np.nan, inplace=True)\n",
"teste.pa_max.replace(0, np.nan, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 142
},
"id": "-ZfhndkASclt",
"outputId": "4e75a9ae-47bd-464c-d4d8-53dec3033f40"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" 8605 \n",
" 36.0 \n",
" 100.0 \n",
" NaN \n",
" 145.0 \n",
" 969.0 \n",
" 0 \n",
" \n",
" \n",
" 14740 \n",
" NaN \n",
" 92.0 \n",
" NaN \n",
" 153.0 \n",
" 883.0 \n",
" 0 \n",
" \n",
" \n",
" 15696 \n",
" NaN \n",
" 126.0 \n",
" NaN \n",
" 197.0 \n",
" 704.0 \n",
" 0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"8605 36.0 100.0 NaN 145.0 969.0 0\n",
"14740 NaN 92.0 NaN 153.0 883.0 0\n",
"15696 NaN 126.0 NaN 197.0 704.0 0"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pa_max > 180') #recebe nan"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"id": "m4Z9NY2-S3MB"
},
"outputs": [],
"source": [
"treino['pa_max'].values[treino['pa_max'].values > 180] = inf-inf\n",
"teste['pa_max'].values[teste['pa_max'].values > 180] = inf-inf"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fG6SooqKjxrV"
},
"source": [
"Também substituímos as variáveis contínuas da pressão por dados categóricos, distribuídos nas faixas da imagem que trouxemos logo no início, porém os resultados foram piores."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "E_JmfjhzTiYD"
},
"source": [
"Medidas para os casos de sepse positiva"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"id": "FPBiW5wmTp1v"
},
"outputs": [],
"source": [
"sepse1 = treino.query('sepse == 1')"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"id": "Ovt4dq1FTxx_",
"outputId": "e339102a-2df0-404c-e0dc-709bc4a25808"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" temperatura \n",
" pulso \n",
" respiracao \n",
" pa_min \n",
" pa_max \n",
" sepse \n",
" \n",
" \n",
" \n",
" \n",
" count \n",
" 2336.000000 \n",
" 2406.000000 \n",
" 1046.000000 \n",
" 1692.000000 \n",
" 1691.000000 \n",
" 2575.0 \n",
" \n",
" \n",
" mean \n",
" 36.335916 \n",
" 99.599751 \n",
" 20.817400 \n",
" 125.989953 \n",
" 67.604376 \n",
" 1.0 \n",
" \n",
" \n",
" std \n",
" 1.136880 \n",
" 20.942447 \n",
" 5.250119 \n",
" 23.701485 \n",
" 16.540542 \n",
" 0.0 \n",
" \n",
" \n",
" min \n",
" 32.000000 \n",
" 32.000000 \n",
" 10.000000 \n",
" 10.000000 \n",
" 11.000000 \n",
" 1.0 \n",
" \n",
" \n",
" 25% \n",
" 35.100000 \n",
" 84.000000 \n",
" 18.000000 \n",
" 111.000000 \n",
" 53.000000 \n",
" 1.0 \n",
" \n",
" \n",
" 50% \n",
" 36.100000 \n",
" 101.000000 \n",
" 19.000000 \n",
" 124.000000 \n",
" 67.000000 \n",
" 1.0 \n",
" \n",
" \n",
" 75% \n",
" 37.000000 \n",
" 112.000000 \n",
" 22.000000 \n",
" 143.000000 \n",
" 79.000000 \n",
" 1.0 \n",
" \n",
" \n",
" max \n",
" 40.000000 \n",
" 186.000000 \n",
" 40.000000 \n",
" 220.000000 \n",
" 140.000000 \n",
" 1.0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"count 2336.000000 2406.000000 1046.000000 1692.000000 1691.000000 2575.0\n",
"mean 36.335916 99.599751 20.817400 125.989953 67.604376 1.0\n",
"std 1.136880 20.942447 5.250119 23.701485 16.540542 0.0\n",
"min 32.000000 32.000000 10.000000 10.000000 11.000000 1.0\n",
"25% 35.100000 84.000000 18.000000 111.000000 53.000000 1.0\n",
"50% 36.100000 101.000000 19.000000 124.000000 67.000000 1.0\n",
"75% 37.000000 112.000000 22.000000 143.000000 79.000000 1.0\n",
"max 40.000000 186.000000 40.000000 220.000000 140.000000 1.0"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sepse1.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2_4oYkjAjjwN"
},
"source": [
"## Utilizando o KNN imputer para o preenchimento dos dados nan"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"id": "uql42hNfjhas"
},
"outputs": [],
"source": [
"#Divide the features into Independent and Dependent Variable\n",
"X = treino.drop('sepse' , axis =1)\n",
"y = treino['sepse']\n",
"y_completo = y.copy()"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"id": "fRKC-aKKijCh"
},
"outputs": [],
"source": [
"# knn imputation treino\n",
"from numpy import isnan\n",
"from sklearn.impute import KNNImputer\n",
"\n",
"# define imputer\n",
"imputer = KNNImputer()\n",
"# fit on the dataset\n",
"imputer.fit(X)\n",
"# transform the dataset\n",
"Xtrans = imputer.transform(X)"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "nrQM2xSmbMRW",
"outputId": "352df3ef-0552-447d-c11f-b5d700897cf9"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 36. , 117. , 18. , 113. , 72. ],\n",
" [ 36. , 105. , 19. , 134.8, 85.4],\n",
" [ 38. , 118. , 18. , 110. , 70. ],\n",
" ...,\n",
" [ 35.4, 69. , 23.4, 149. , 93. ],\n",
" [ 35.2, 95. , 19. , 136. , 82. ],\n",
" [ 37. , 88. , 20. , 141.8, 63.4]])"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Xtrans"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"id": "CE_7eTrZbU4o"
},
"outputs": [],
"source": [
"#vontando treino para o tipo dataframe\n",
"basey = pd.DataFrame()\n",
"basey['sepse'] = y\n",
"basex = pd.DataFrame(Xtrans, columns=X.columns)"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"id": "B7asQkcQd7eS"
},
"outputs": [],
"source": [
"#Juntando Treino pós imputação\n",
"treino_full = pd.concat([basex, basey], axis=1)\n",
"treino_completo = treino_full"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 252
},
"id": "vnXjCOAqgCoI",
"outputId": "f5deb5e3-ccb6-497e-bf46-5e09b96eb398"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Frequencia treino:\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" Qtd Nan \n",
" Qtd Nan % \n",
" \n",
" \n",
" \n",
" \n",
" temperatura \n",
" 0 \n",
" 0.0 \n",
" \n",
" \n",
" pulso \n",
" 0 \n",
" 0.0 \n",
" \n",
" \n",
" respiracao \n",
" 0 \n",
" 0.0 \n",
" \n",
" \n",
" pa_min \n",
" 0 \n",
" 0.0 \n",
" \n",
" \n",
" pa_max \n",
" 0 \n",
" 0.0 \n",
" \n",
" \n",
" sepse \n",
" 0 \n",
" 0.0 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Qtd Nan Qtd Nan %\n",
"temperatura 0 0.0\n",
"pulso 0 0.0\n",
"respiracao 0 0.0\n",
"pa_min 0 0.0\n",
"pa_max 0 0.0\n",
"sepse 0 0.0"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Verificado a quantidade de NaN por atributo\n",
"nan_train= pd.DataFrame()\n",
"nan_train['Qtd Nan'] = treino_full.isna().sum()\n",
"nan_train['Qtd Nan %'] = round(100*treino_full.isna().sum()/len(treino_full),2)\n",
"print('Frequencia treino:')\n",
"nan_train.head(6)"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "fe9aFAMikgBL",
"outputId": "2590fea3-78e5-44da-92a4-a21f56267077"
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEmCAYAAAB1S3f/AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAA1lElEQVR4nO3de1wUZf8//tfCCuKBDAMpRAXTVDJFrTzgWVQUVPC0YYBSaZ7yxgOBYiqmeRvddRNa3uXDTCkPGFkq3trBQ2GZ3KbySdBEFETwgIKgLIe9vn/wY367sCCju+wCr+fj4UN2duba91w7M++95pq5RiGEECAiIqolC1MHQERE9QsTBxERycLEQUREsjBxEBGRLEwcREQkCxMHERHJwsRB1crMzETXrl0xfvx4jB8/Hj4+Ppg8eTKSkpKM8nnPPfcccnNza5znyJEj+Pe///1YnxMWFobNmzdX+/nnzp3DW2+9VWMZZ8+exTvvvPNYcZiLzZs3IywszOif8/vvv8Pb29von0PGpzR1AGTemjZtir1790qvDxw4gPDwcBw6dMgk8Zw7dw55eXlG/Yzu3bsjOjq6xnn+/vtv5OTkGDUOInPFxEGy3L17F/b29tLrnTt3Ytu2bbCwsMBTTz2F5cuXo3379pgxYwbc3NwQGhqKxMREhIWF4ZtvvkFUVBSsra2RkpKC27dvY8CAAYiIiECTJk10PmfDhg3Yv38/LC0t4eLiguXLlyMrKws7duxAWVkZWrZsiZCQEJ1lLl26hDVr1uDu3bsoKytDQEAAJk2aJHsdf//9d6xevRr79u3DqVOnsG7dOmg0GgDArFmz8MILLyA6Ohr37t1DeHg43nvvPb314OLigtzcXISHh+Pq1ato1aoV7O3t0alTJ8yfPx/PP/88hg8fjpSUFERFRSE1NRU7d+5ESUkJ8vLy8MYbb8Df3x/ffPMNDh06BI1Gg6ysLLRp0wZTpkzB9u3bkZ6ejhkzZiA4OBj379/HypUrceXKFdy9exfNmzdHVFQUXF1dddavpKQE7777LhITE9G6dWu0bt0aLVu2BADcu3cPa9aswYULF1BSUoJ+/fohNDQUSqXuoSIsLKza7/G5557DiRMnYGdnBwDSa2366nXUqFG4d+8eVq1ahZSUFCgUCgwcOBALFy6s8vlkYoKoGhkZGaJLly5i3LhxYty4cWLIkCHCzc1NHDlyRAghRGJiohgxYoS4ffu2EEKIPXv2CC8vL6HRaEROTo7o37+/OHz4sBg4cKA4efKkEEKIt99+W0yYMEEUFBQItVotpk2bJrZt2yaEEKJz587i9u3bIi4uTkydOlUUFhYKIYSIjo4WwcHB0t+rVq2qEmtJSYkYM2aMSE5OFkIIkZ+fL7y8vMTp06erzPv2228LDw8Pab0q/lV8/m+//SbGjh0rhBAiMDBQ7Nu3TwghxPnz58XKlSuldZ05c+ZD6yEkJESsX79eCCFETk6OGDBggIiOjpbWNz4+XgghREFBgZgyZYrIzc0VQghx+vRp0bNnT6m83r17i6ysLFFWVibGjBkj5s+fL8rKysT58+dF9+7dRVlZmUhISBCrV6+W1nP58uUiMjKyyvp/8cUXIjAwUKjValFYWCh8fX3F22+/LYQQIiwsTHz55ZdCCCFKS0vF4sWLxX/+8x+9dfiw77GCnHoNDQ0Vq1evFhqNRqjVahEcHCw2bdpU5fPJtJjGqUaVT1UlJiZi7ty5+O6773D8+HGMGTNG+mXp5+eHNWvWIDMzE87Ozli9ejXmzJmD+fPn48UXX5TK8PX1RfPmzQEA48ePx48//ohXX31Vev/YsWPw8/NDs2bNAACBgYH49NNPUVxcXG2c6enpuHr1KpYuXSpNKyoqwl9//YWePXtWmX/69Ol47bXXdKY999xzVebz8vJCZGQkfvrpJ/Tv3x8LFy6sMk9N9XD06FHEx8cDABwcHDB69GidZfv06QMAaN68OT799FMcPX
oU6enpSElJwf3796X5unfvjqeffhoA0LZtW3h4eMDCwgLOzs5Qq9V48OABRo8eDWdnZ2zbtg1XrlzByZMn4e7uXiXeEydOwNvbG1ZWVrCysoKPjw9SU1MBlPchnTt3DnFxcVIdVudh32NNqqvXY8eO4euvv4ZCoYCVlRVUKhW2bt2KmTNn1qpcqhtMHCRL//790a5dO5w7d046zaBNCIHS0lIA5f0ATz31FM6ePaszj6Wlpc78Fha612hoNBooFAqd1xVlVqfi9JV2krt165Z0CuZRqVQqDB06FL/++iuOHz+OmJgYHDx4sEq8lVXUg1KphNAaDq7yulYkx+zsbEydOhVTpkxB7969MXr0aPz888/SfFZWVjrL6Tt189VXX2HXrl2YNm0afHx80KpVK2RmZj50HbW/D41Gg3//+9/o2LEjACA/P1/nu6huOX3fI4Bqk3119foo3z3VPV5VRbJcvnwZ165dQ9euXTFw4EAcOHBAuhJqz549aNWqFdq3b4+zZ8/iyy+/xJ49e3Dv3j1s3bpVKiMhIQHFxcVQq9WIj4/H0KFDdT5j4MCB2LNnj/SLe9u2bXjxxRdhZWUFS0tLvQcSFxcXndbR9evX4e3tjeTk5MdaX5VKhfPnz8PPzw+rV69Gfn4+bt68qRNHTfUwePBg6df7nTt38MMPP+g9ECcnJ8POzg5z5syBh4eHlDTKyspqHesvv/wCX19fTJ48GS4uLvjpp5/0Lj9w4EB8++23UKvVUKvVOHDggPSeh4cHvvjiCwghUFxcjNmzZ2P79u16P6+679HOzg7nzp0DAOzbt0/vstXVq4eHB7Zv3y59/q5du9C/f/9a1wHVDbY4qEZFRUUYP3689Fqj0SAyMhIuLi5wcXHB9OnTERQUBI1GAzs7O2zatAn379/HwoULERERgTZt2mDdunWYPHmydLqqadOm8Pf3R35+PkaNGoWJEyfqfOakSZNw/fp1TJ48GRqNBu3bt0dUVBQAoG/fvli8eDFWr16N5cuXS8tYWVlh48aNWLNmDT7//HOUlpZiwYIF6N2792Ot/+LFi7F27Vp89NFHUCgUmDdvHtq2bYuysjJs2LAB8+bNQ0xMjN56sLCwQHh4OCIiIqQWwDPPPIOmTZtW+ZwBAwYgLi4Oo0ePhkKhwEsvvQQ7OztcuXKl1rEGBwfjnXfekRJVz549ceHChSrzqVQqXL16Fd7e3lKCq7Bs2TKsWbMGPj4+KCkpQf/+/fH666/r/bzqvseIiAhERkbC1tYW/fv317mY4mH1GhERgXfffVf6/IEDB+LNN9+sdR1Q3VAIwWHVqe6EhYWhU6dOVfoXGqrY2Fh069YN7u7uKC4uhr+/P+bPn4/BgwebOrTH0ti+R9LFFgeRET377LNYvXo1NBoNSkpKMHr06HqfNIjY4iAiIlnYOU5ERLIwcRARkSxMHEREJAsTBzU4mZmZeOONN/Diiy9iwIABiIyM1Ln349KlSwgMDETv3r3h6emJw4cP11je/v374eXlhZ49e2LEiBE4deqU9F5AQAC6d+8Od3d3uLu7Y9SoUUZbr9ravn07/Pz88Pzzz1cZ9fZhdSOnrMWLF8PDwwO9evXCqFGjsHv3bqOsD5kfXlVFDc6qVavQunVr/PLLL8jPz0dwcDC++uorBAYGorS0FHPmzIFKpcKWLVtw8uRJzJ49G/Hx8XBxcalS1q+//oqoqCh8+OGHeOGFF3Dz5s0q87zzzjuYPHlyXaxarTg4OGDOnDk4fvw41Gq1zns11Y3csmbNmoW1a9fCyspKSsZdu3bF888/b7R1I/PAFgeZzCeffIIVK1ZIr/Py8uDm5lblACVXZmYmvLy8YG1tDXt7e3h4eODvv/8GAKSlpeHGjRuYPn06LC0t0a9fP/Tq1UtnqBJtH3/8MebMmYOePXvCwsICbdq0QZs2bR4rPqB8hNoPP/wQw4YNg5ubG5577jk899xzGDdu3GOXPXLkSIwYMQKtWrWq8l5NdS
O3rE6dOklDoSgUCigUCly9evWx4yfzxxYHmcyFCxfw8ssvS6/Pnz8PFxcXWFtb68w3a9asah8e1bt3b2zatElnWmBgIPbv34+XXnoJ+fn5OH78OBYsWAAA0Hf1uRACFy9erDK9rKwMycnJGDZsGDw9PaFWqzFixAiEhobq3P39wQcfICoqCi4uLggJCdFZp+p89NFH+OOPPxAbG4snnngCc+bMQYsWLfD2228/1ro/TE118yhWrlyJ+Ph4FBUVoVu3brxHpZFg4iCTuXDhAqZPny69TklJ0TtCrdyD40svvYTdu3ejd+/eKCsrg6+vL0aMGAEAcHV1hZ2dHT7//HNMnz4dv//+O/744w+9B/tbt26hpKQEBw8eRGxsLJRKJebMmYNPPvlEehbI4sWL0bFjR1hZWWH//v148803sXfvXrRr167a+AoKCrBt2zZ899130oi3I0eOREJCApydnR9r3R+mprp5FCtXrsTy5ctx+vRpnDx5sspgjNQw8VQVmURxcTGuXr2Kzp07S9NSUlLQtWvXxypXo9Hgtddeg6enJ/7880/89ttvyMvLw/vvvw8AaNKkCTZs2ICjR4/Cw8MDW7ZswejRo/WefqpoVQQEBMDBwQF2dnaYMWMGjh49Ks3To0cPtGjRAlZWVvD19UWvXr103tfn1KlTcHZ2RocOHaRp+fn5eOqppx5r3R/mYXXzqCwtLdGnTx9kZ2fj66+/NlC0ZM7Y4iCTuHTpEtq0aQMbGxsA5aeLTp48qfeZ1K+//nqNp2s+//xz6fXdu3dx/fp1vPrqq9LzJiZOnIiPPvoIoaGhAIAuXbrojPiqUqkwYcKEKmU/8cQTcHR0rHZYcX0UCoXe02HacnNzYWtrK70WQuDw4cN6O6jlrPvD1KZuHkdZWRn7OBoJJg4yidTUVNy+fRtXr16Fg4MDPvnkE1y7dg1OTk5V5pVzcLSzs0Pbtm3x9ddfS49TjY+P1zkFlpKSAhcXF2g0Gnz11Ve4ceMG/Pz89Jbn5+eHbdu2YeDAgVAqldi6dSuGDBkCoLyVcObMGbz00kuwtLTEgQMHcOrUKZ2HSenTqVMn/PXXX1KfTkxMDBQKBcaMGfNY616htLQUZWVl0Gg0KCsrg1qthqWlZa3qprZl5eXl4bfffsOQIUPQtGlTJCYmYv/+/dIoxtSwcawqMon169cjMzMTqampuH//PmbOnImtW7eid+/e+Oc///lYZZ8/fx5r165FSkoKLCws8PLLL2PFihVo3bo1AOCf//wn4uLiUFpait69e0vPSQfKf+H36dNHGsq7pKQEa9aswb59+2BtbQ0vLy8sWbIE1tbWyM3NxRtvvIG0tDRYWlrC1dUVCxYswIABA6RY3njjDahUKgwfPlwnxk8++QSxsbEAyp+BERoaKj1B8HF9/PHHiImJ0Zk2b948zJ8//6F1U3n9qytr2rRpeOutt5CSkgKNRgMnJycEBARgypQpBlkHMm9MHGQSr7/+OiZPnmwWN8wZ065du+Do6IhBgwaZOhQig2HnOJnEhQsXpMeTNmQV94oQNSRscVCdy8vLw4ABA3D69Gk0adLE1OEQkUxMHEREJAtPVRERkSxMHEREJAsTBxERycLEQUREsjBxEBGRLEwcREQkCxMHERHJwsRBRESyMHEQEZEsTBxERCQLEwcREcnCBzlRvfXZZ58hLS3NYOXduXMHAPDkk08apDxXV1e88cYbBimLyJwwcVC9lZaWhuS/UmHZtJVByisrugsAyL5TbLCyiBoiJg6q1yybtkKz9sMfPmMt3L/yIwAYpLyKsogaIvZxEBGRLEwcREQkCxMHERHJwsRBRESyMHEQEZEsTBxERCQLEwcREcnCxEFERLIwcRARkSxMHEREJAsTBxERycLEQUREsjBxEBGRLEwcREQkCxMHERHJwsRBRESyMHEQEZEsTBwN1E8//YSffvrJ1GFQA8ZtrPHio2MbqMOHDwMAhg0bZuJIqKHiNtZ4scVBRE
SyMHEQEZEsTBxERCQLEwcREcnCxEFERLIwcRARkSxMHEREJAsTBxERycLEQUREsjBxEBGRLEwcREQkCxMHERHJwsRBRESyMHEQEZEsjTpxpKWlYerUqYiLi4OPjw9++eUXk8SRm5uLsLAw3LlzR4rp8uXLVebTfu/06dMYP348zpw5I/19/PhxhIWF4fjx40hOTsa9e/dMsDbUmKSnp8PHxwf+/v7w8fGBn58ffHx89P6bMmWKtK1W7HM+Pj5YsmSJtP9VvJ+QkICpU6fiwIEDVcqZMGGC9HdcXBymTp2KL7/8Upr2yy+/SPvBuHHjMH/+fLzzzjvw8fHBmjVrpP3t2LFj0j4ElO+HCxYswJQpU6T9T99+VjG/Nn3Lau/XlffRO3fuSMseO3asyvGnpuNA5c+tXF5dUAghRJ1+ohmZM2cOMjIypNdKpRLx8fF1HsfGjRtx8OBBeHl54dy5c8jIyEC7du2wYcMGnfkq4m3Xrh1u376NwsJCtGjRAkIIFBYWQqlUoqysDJaWligtLYWFhQX27t1b5+tTV8LDw3E+LQfN2g83SHn3r/wIAAYp7/6VH9HVtQ3ee++9xy7LXIWHhyM5OVnWMs2bN0dhYaHe95RKJaytrVFYWAiFQgEhhPS/HJX3g8q8vLxw8OBB6f0WLVrg66+/xsaNG5GQkAAA0v6nUqmq7GcV82vTt6z2fn306FGdfdTLywuzZ88GAPj6+qK0tFTn+KO9r1c+DlT+3IrPqCivLjTaFkdaWppO0gCA0tLSOm915Obm4scff4QQAocPH5Ziunr1qs6vDe14r169Ku18BQUF0t+lpaUQQkg7i0aj0fvriMgQ0tPTZS9TXdIAyrffivcrksWj/K6tvB9UdvDgQZ33CwoKcPz4cenBVED5PpaQkKB3PysoKNDZr3Jzc6sse+bMGWm/PnToUJV99IcffsCdO3dw7NgxKY6K40/lfb26Vof2saOivLrSaFsclVsbFeq61bFx40YcPnxY70au/WujungfxsLCAt26dXvsOM1RWloaHpRaokXHsQYpz5AtjoJL+2GjLIOrq+tjl2Wu5LY2zJlSqayyD9bU2tFudWi3NrTfLyoqqjZ5KZVKjBw5EocOHdKZR6lU4umnn9bZ16trdWgfOyrKq6tWR6NtcVR3EK7uizaWI0eOVPuZV69elf5+lKQBlLc6iKhm+vbBmn5TFxQUSH8fOXJE7/s1HUtKS0vx888/V5mntLS0yr6ufRzQpn3sqCivrjTaZ447OztX2+KoS0OGDKmxxVGhungfpkWLFg32PHtFH4c5slA2hWsD7+Pw8fExdQgG8ygtjgpDhgx5pBbH0KFDa93i0Ef72FFRXl1ptC2OxYsX652+aNGiOo1DpVLBwqL8a2jSpInOe9oxVhfvw4SFhT16cEQ10D541icKhaLKtIULF1b50VjTaR/t/UqlUlVZNiwsTNqv9f0YtbCwgEqlQkhIiM70RYsWVdnXq9v3tY8dFeXVlUabOFxdXeHs7KwzTalUwsPDo07jsLOzw/Dhw6FQKODp6SnF1K5dO7i4uOiNt127dmjevDmA8p234m+lUgmFQiFtqBYWFujRo0ddrg41Ih06dJC9TMW2qo9SqZTerzi46zvIP0zl/aCy0aNH67zfokULDBw4EJ6entI87dq1g5eXl979rEWLFjr7lZ2dXZVle/ToIe3XI0eOrLKPjhgxAk8++SQGDRokxVFx/Km8r2sfB7RpHzsqyqsrjTZxAOWZvFmzZggKCgJQ962NCiqVCt26dYNKpZJi0vcrQ/u9t99+GxYWFggLC5P+XrhwIbp164aFCxcCQJXESGRoFa2Oli1bAqjaatZmY2MjbasV+xwAdOnSBUD5/lfx/uzZs9GsWTO8+eabVcqxtLSU/g4KCkKzZs0wefJkadqiRYuk/UChUKBDhw5wd3cHAPTt21fa30JCQqR9CCjfD11dXWFjYyPtf/r2M32teH3Lau/XlfdR7dZBRa
tD+/hT03Gg8udWLq8uNNqrqhq68PBwAGjQ59h5H4dpNYZtjPRr1C0OIiKSj4mDiIhkYeIgIiJZmDiIiEgWJg4iIpKFiYOIiGRh4iAiIlmYOIiISBYmDiIikoWJg4iIZGHiICIiWZg4iIhIFiYOIiKShYmDiIhkabSPjm3otB8sQ2QM3MYaLyaOBmrYsGGmDoEaOG5jjRdPVRERkSxMHEREJAsTBxERycLEQUREsjBxEBGRLEwcREQkCxMHERHJwsRBRESyMHEQEZEsTBxERCQLEwcREcnCxEFERLIwcRARkSxMHEREJAsTBxERycLEQUREsjBxEBGRLHwCINVrZUV3cf/KjwYrC4BByisvq81jl0Nkjpg4qN5ydXU1aHl37lgBAJ588kkDlNbG4PERmQuFEEKYOggiIqo/2MdBRESyMHEQEZEsTBxERCQLEwcREcnCxEFERLIwcRARkSxMHEREJEu9vwGwtLQU2dnZpg6DiKhecnR0hFIpLxXU+8SRnZ2N4cOHmzoMIqJ66ccff0Tbtm1lLVPv7xw39xZHdnY2pk2bhtjYWDg6Opo6nGoxTsNinIZTH2IE6m+cjbLFoVQqZWdLU3B0dGScBsQ4Das+xFkfYgQaR5zsHCciIlmYOIiISBYmDiIikoWJw8hsbW0xb9482NramjqUGjFOw2KchlMfYgQaV5z1/qoqIiKqW2xxEBGRLEwcREQkS72/j8McxcTEICEhAQAwePBghIaGIjw8HElJSbCxsQEAzJs3D56eniaLMSAgALm5udKNP5GRkSgsLMR7770HtVoNLy8vhISEmCw+ANi9eze2b98uvc7MzMT48ePx4MEDs6nLgoICqFQqfPrpp2jbti0SExP11uH58+exbNkyFBYWok+fPli1apXsm64MGefOnTuxbds2KBQKPP/881i1ahWsrKwQExODPXv2SOe/p0yZgmnTppkkxur2GXOqy0uXLuFf//qX9F5OTg569OiBTZs2mbQu9R2DDLptCjKoX3/9VUydOlWo1WpRXFwsAgMDxaFDh4S3t7fIyckxdXhCCCE0Go3w8PAQJSUl0rQHDx6IwYMHi6tXr4qSkhIRHBwsjhw5YsIodV24cEF4enqK27dvm01d/vnnn8Lb21u4ubmJjIyMGutw7Nix4vTp00IIIcLDw0VsbKzJ4kxLSxOenp7i3r17QqPRiNDQULFlyxYhhBCzZs0S//vf/+ostupiFEJU+z2bU11qu3Hjhhg+fLi4fPmyEMJ0danvGPT9998bdNvkqSoDs7e3R1hYGKysrNCkSRN07NgRWVlZyMrKwtKlS+Hj44Po6GhoNBqTxZiWlgYACA4Oxrhx47B9+3acPXsW7du3h7OzM5RKJXx8fHDw4EGTxVjZypUrERISAhsbG7Opy127dmHFihVwcHAAgGrr8Nq1aygqKkLPnj0BAH5+fnVat5XjtLKywooVK9CiRQsoFAp07twZWVlZAIDk5GRs2rQJPj4+iIyMhFqtNkmMDx480Ps9m1tdalu/fj1UKhU6dOgAwHR1qe8YlJ6ebtBtk4nDwDp16iR9Cenp6UhISMDAgQPRt29frF27Frt27cKpU6cQFxdnshjz8/PRr18/bNiwAV988QV27NiBrKws2NvbS/M4ODggJyfHZDFqS0xMRFFREby8vHDr1i2zqcs1a9agT58+0usbN27orcPK0+3t7eu0bivH6eTkhAEDBgAAcnNzERsbi+HDh6OwsBBdu3bFkiVLEB8fj/z8fGzcuNEkMVb3PZtbXVZIT0/HyZMnERgYCAAmrUt9xyCFQmHQbZOJw0guXryI4OBghIaGwtXVFRs2bICDgwNsbGwQEBCAo0ePmiw2d3d3rF+/Hi1btoSdnR0mTZqE6OhoKBQKaR4hhM5rU9qxYwdmzJgBAHB2djarutSm0Wj01mF1000tJycHQUFBmDhxIl5++WU0b94cn332GTp27AilUong4GCT1W1137O51uXOnTvh7+
8PKysrADCLutQ+Bjk7Oxt022TiMIKkpCRMnz4dixYtgq+vL1JTU/Hf//5Xel8IUaedeZWdOnUKJ06c0InHyckJN2/elKbdvHlTb3O8rhUXF+OPP/7AsGHDAMDs6lKbo6Oj3jqsPP3WrVsmr9tLly5BpVLB19cXc+fOBQBkZWXptN5MWbfVfc/mWJdA+dDkY8aMkV6bui4rH4MMvW0ycRjY9evXMXfuXERFRWHs2LEAyjeatWvXIi8vDyUlJdi5c6dJr6i6d+8e1q9fD7VajYKCAsTHx2PhwoW4fPkyrly5grKyMuzbtw+DBg0yWYwVUlNT0aFDBzRr1gyA+dWlth49euitQycnJ1hbWyMpKQkAsHfvXpPWbUFBAV577TUsWLAAwcHB0vSmTZvi/fffR0ZGBoQQiI2NNVndVvc9m1tdAuWn+4qKiuDs7CxNM2Vd6jsGGXrbNI+fag3I5s2boVarsW7dOmmaSqXCzJkz8corr6C0tBQjR46Et7e3yWIcOnQozpw5gwkTJkCj0cDf3x/u7u5Yt24d5s+fD7VajcGDB2P06NEmi7FCRkaGzrMNunTpYlZ1qc3a2rraOoyKikJERAQKCgrg5uYmnQs3hbi4ONy6dQtbtmzBli1bAADDhg3DggULEBkZidmzZ6OkpAS9evWSThHWtZq+Z3OqS6D8MvHKz9+ws7MzWV1Wdwwy5LZZ74ccqXiQ06M8jISIiOSr96eqKh4da85PASQiakjqfeIgIqK6xcRBRESyMHEQEZEsTBxERCQLL0Mieoit+/9Cbn4R7GybImhsN1OHQ2RyTBxED5GbX4Sbdx6YOgwis8FTVUREJAsTBxERycLEQUREsrCPgxqEig5sAOzEJjIyJg5qENiBTVR3eKqKiIhkYeIgIiJZmDiIiEgWJg4iIpKFiYOIiGRh4iAiIll4OS6RAXAgRGpMmDiIDID3kVBjwlNVREQki9FaHLt378b27dul15mZmRg/fjwePHiApKQk2NjYAADmzZsHT09PnD9/HsuWLUNhYSH69OmDVatWQalkg4iIyNwY7cg8efJkTJ48GQBw8eJFzJ07F/PmzUNQUBC2b98OBwcHnfmXLFmCd999Fz179sTSpUuxa9cu+Pv7Gys8IiJ6RHVyqmrlypUICQmBjY0NsrKysHTpUvj4+CA6OhoajQbXrl1DUVERevbsCQDw8/PDwYMHq5STn5+PzMxMnX/Z2dl1sQpERPT/Mfq5oMTERBQVFcHLywsZGRno27cvVqxYgZYtW2LWrFmIi4tDp06dYG9vLy1jb2+PnJycKmVt3boVMTExxg6ZiIhqYPTEsWPHDsyYMQMA4OzsjA0bNkjvBQQE4Ntvv0XHjh2hUCik6UIIndcVgoKC4OvrqzMtOzsb06ZNM1L0RERUmVETR3FxMf744w+sW7cOAJCamor09HSMGjUKQHmCUCqVcHR0xM2bN6Xlbt26VaUPBABsbW1ha2trzJCJiOghjNrHkZqaig4dOqBZs2YAyhPF2rVrkZeXh5KSEuzcuROenp5wcnKCtbU1kpKSAAB79+7FoEGDjBkaERE9IqO2ODIyMuDo6Ci97tKlC2bOnIlXXnkFpaWlGDlyJLy9vQEAUVFRiIiIQEFBAdzc3BAYGGjM0IiI6BEZNXGMGTMGY8aM0Zk2bdo0vX0SXbp0QVxcnDHDISIiA+Cd40REJAsTBxERycIxPajR4oi2RI+GiYMaLY5oS/RoeKqKiIhkYeKgRk/PIAVEVINaJY6lS5dWmfbWW28ZPBgiU3iyZVNs3f8XPvz6f9i6/y9Th0Nk9mrs41ixYgVycnKQlJSE3NxcaXppaSkyMjKMHhxRXWF/B1Ht1Zg4Jk2ahIsXLyI1NVUaXwoALC0tpSHQiYiocakxcXTv3h3du3dH//79dYYOISKixqtWl+Nev34dS5YsQV5eHoQQ0vTvv//eaIEREZF5qlXieOedd+Dn54du3brpfU4GERE1HrVKHEqlUnoYEx
ERNW61uhy3U6dOSE1NNXYsRERUD9SqxZGRkYGJEyfimWeegbW1tTSdfRxERI1PrRJHSEiIseMgIqJ6olaJo3Pnzo9UeEBAAHJzc6FUln9MZGQkCgsL8d5770GtVsPLy0tKSufPn8eyZctQWFiIPn36YNWqVdJyRBUqRrRt/URTBI6p2xFteV0IUblaHZn79u0LhUIBIYR0VZW9vT2OHTtW7TJCCKSnp+Pnn3+WEkBRURFGjx6Nbdu24emnn8asWbNw9OhRDB48GEuWLMG7776Lnj17YunSpdi1axf8/f0NsIrUkFTc4W1n21RKIi7P2NbJZ1cMTcKh2Kmxq1XiSElJkf4uLi7Gvn37cPny5RqXSUtLAwAEBwfj7t27mDJlCjp37oz27dvD2dkZAODj44ODBw/i2WefRVFRkXQ3up+fH6Kjo6skjvz8fOTn5+tMy87Ors0qUAOknUS01dQyeNxkw6FJiB7heRxWVlbw8/ODn58fFi1aVO18+fn56NevH5YvX46SkhIEBgbi9ddfh729vTSPg4MDcnJycOPGDZ3p9vb2yMnJqVLm1q1bERMTIzdkamRqahlUl2yIqPZqlTju3r0r/S2EQHJycpVf/pW5u7vD3d1dej1p0iRER0ejd+/eOmUpFApoNBqdGwu1T4lpCwoKgq+vr8607OxsTJs2rTarQY0IWwZExiO7jwMAWrdujWXLltW4zKlTp1BSUoJ+/foBKE8GTk5OuHnzpjTPzZs34eDgAEdHR53pt27dgoODQ5UybW1tYWtbN+eziYhIP9l9HLV17949REdHY8eOHSgpKUF8fDxWrVqFf/zjH7hy5Qratm2Lffv2YeLEiXBycoK1tTWSkpLQu3dv7N27F4MGDZL9mdRwmKoTmldOET1crRKHRqPB5s2bcezYMZSWlmLAgAF48803a7xcdujQoThz5gwmTJgAjUYDf39/uLu7Y926dZg/fz7UajUGDx6M0aNHAwCioqIQERGBgoICuLm5ITAw0DBrSPWSqU41afePAKizK7aI6pNaJY4PPvgAKSkpCAoKgkajwc6dO7F+/Xq9TwbU9o9//AP/+Mc/dKb169cP3333XZV5u3Tpgri4uNpHTo2CKVoA2knL0J3ovJyXGoJaJY7jx49jz549aNKkCQBgyJAhGDdu3EMTB9Hj0m4BmNOv/0dNaOy0p4agVolDCCElDaD8klzt10TGZI6X0BozobFVQuauVomjS5cuWLt2LV599VUoFAps27btkYchIWooapPQHqW/hK0SMne1GlZ9xYoVyM/Ph0qlwuTJk3Hnzh0sX77c2LER1XsVSeDmnQfIKyg2dThEBlFj4iguLsbbb7+NEydOYN26dUhMTMQLL7wAS0tLtGjRoq5iJCIiM1Jj4oiOjkZBQQF69eolTVu9ejXy8/Px8ccfGz04IiIyPzUmjiNHjuCDDz5A69atpWlt2rTB+vXr8cMPPxg9OGp4tu7/Cx9+/T9s3f+XqUMhokdUY+d4kyZN0LRp1Y6/Fi1awMrKymhBUcPFjt9yvEOd6rMaE4eFhQUKCgqq9GcUFBSgtLTUqIFR42OO92sYC5/tQfVZjaeqvL29ERERgfv370vT7t+/j4iICIwcOdLowVHjUtEaaSxXH1Wsb8XlukT1RY2JIygoCC1btsSAAQMwZcoUTJo0CQMGDICtrS3mzp1bVzESEZEZeeipqtWrV+PNN9/E//3f/8HCwgIvvPCC3iHPieTgOX6i+qtWd447OTnBycnJ2LFQI8JRaInqL9mPjiUyFGOOQluBLRsiw2PioAatrkfXZaKixoCJgxq8uhxd11yHgScyJKMmjpiYGCQkJAAABg8ejNDQUISHhyMpKQk2NjYAgHnz5sHT0xPnz5/HsmXLUFhYiD59+mDVqlU1PmGQyFyZ4zDwRIZktCNzYmIifvnlF8THx0OhUOD111/H4cOHkZycjO3bt1e5MmvJkiV499130bNnTyxduhS7du2Cv7+/sc
IjIqJHVKth1R+Fvb09wsLCpIc+dezYEVlZWcjKysLSpUvh4+OD6OhoaDQaXLt2DUVFRejZsycAwM/PDwcPHqxSZn5+PjIzM3X+ZWdnG2sViIhID6O1ODp16iT9nZ6ejoSEBMTGxuLkyZNYsWIFWrZsiVmzZiEuLg6dOnWCvb29NL+9vT1ycnKqlLl161bExMQYK2QiIqoFo3ciXLx4EbNmzUJoaChcXV2xYcMG6b2AgAB8++236NixIxRal6MIIXReVwgKCoKvr6/OtOzsbEybNs14K0BERDqMmjiSkpLw1ltvYenSpRg7dixSU1ORnp6OUaNGAShPEEqlEo6Ojrh586a03K1bt/TenW5rawtbW16pQkRkSkbr47h+/Trmzp2LqKgojB07FkB5oli7di3y8vJQUlKCnTt3wtPTE05OTrC2tkZSUhIAYO/evRg0aJCxQiMiosdgtBbH5s2boVarsW7dOmmaSqXCzJkz8corr6C0tBQjR46Et7c3ACAqKgoREREoKCiAm5sbAgMDjRUaERE9BqMljoiICEREROh9T1+fRJcuXRAXF2escIjMFu82p/qGd9iRUfAhRbXHu82pvmHiIIPRPvjxEbHy8G5zqk+YOMhgePAzLu1h6Fs/0RSBY9iSI9Ng4iCqJyoPQ1+RSJhEqK4xcRCZqYd1mmu38NinRHWJiYPITMnpNK+uT4kJhYyBiYOMipeaPp7H7TfiRQpkDEwcZFS81JSo4THakCNEFSp+9eYVFJs6lEaLLT8yJLY4iBoB7ZYfAPZ50GNh4iBqJB6lv4Od66QPEwdRI1TbU1fsXCd9mDjosbDj27zUNiFon7pia4LkYuKgh9JODnkFxdJ5cu0xqTjMiHl4lHs/2HFOcjFx0ENpJ4fKw16Q+ZGbzCsnm4ofB2xFUnWYOBqoxz0NwVNQjYu+HwfaiUdOq0RfEuLpsIbFrBLH999/j08++QSlpaUICgrS+8Anqp3H7dTkKSjSpt0q0Teoor4h9Su3UB8X+2TMh9kkjpycHHz44Yf45ptvYGVlBZVKhZdffhnPPvusqUNrNNjKoJroG1QRMG5fF5/xYp7MJnEkJiaib9++aNWqFQBg1KhROHjwIObNmyfNk5+fj/z8fJ3lrl27BgDIzs6us1jrA4vSfFhDDYvSEmRmZtZqmbu5N5B/T41cqwewKC2BNdQoK4LevwFU+57cv1lW3ZZliFju3S9B/r3ysmqzvVTeDn/8IwMF99Vo0cwaw190lrVNytmmtdX2MxsbR0dHKJXyUoFCCCGMFI8smzZtwv379xESEgIA2L17N86ePYvVq1dL83z88ceIiYkxVYhERA3Ojz/+iLZt28paxmxaHBqNBgqtHjghhM5rAAgKCoKvr6/OtOLiYmRkZKBDhw6wtLSsk1jlyM7OxrRp0xAbGwtHR0dTh1MtxmlYjNNw6kOMQP2N81FiNZvE4ejoiFOnTkmvb968CQcHB515bG1tYWtb9fy7q6ur0eN7XI6OjrKzuikwTsNinIZTH2IEGkecZjM6bv/+/XHixAnk5ubiwYMHOHToEAYNGmTqsIiIqBKzaXG0adMGISEhCAwMRElJCSZNmoQXXnjB1GEREVElZpM4AMDHxwc+Pj6mDoOIiGpgNqeqGipbW1vMmzdPb9+MOWGchsU4Dac+xAg0rjjN5nJcIiKqH9jiICIiWZg4iIhIFrPqHG8oYmJikJCQAAAYPHgwQkNDER4ejqSkJNjY2AAA5s2bB09PT5PFGBAQgNzcXGmogcjISBQWFuK9996DWq2Gl5eXdBe/qezevRvbt2+XXmdmZmL8+PF48OCB2dRlQUEBVCoVPv30U7Rt2xaJiYl66/D8+fNYtmwZCgsL0adPH6xatUr2MA+GjHPnzp3Ytm0bFAoFnn/+eaxatQpWVlaIiYnBnj17pPPfU6ZMqbPBRivHWN0+Y051eenSJfzrX/+S3svJyUGPHj2wadMmk9alvmOQQbdNQQb166+/iq
lTpwq1Wi2Ki4tFYGCgOHTokPD29hY5OTmmDk8IIYRGoxEeHh6ipKREmvbgwQMxePBgcfXqVVFSUiKCg4PFkSNHTBilrgsXLghPT09x+/Zts6nLP//8U3h7ews3NzeRkZFRYx2OHTtWnD59WgghRHh4uIiNjTVZnGlpacLT01Pcu3dPaDQaERoaKrZs2SKEEGLWrFnif//7X53FVl2MQohqv2dzqkttN27cEMOHDxeXL18WQpiuLvUdg77//nuDbps8VWVg9vb2CAsLg5WVFZo0aYKOHTsiKysLWVlZWLp0KXx8fBAdHQ2NRmOyGNPS0gAAwcHBGDduHLZv346zZ8+iffv2cHZ2hlKphI+PDw4ePGiyGCtbuXIlQkJCYGNjYzZ1uWvXLqxYsUIa4aC6Orx27RqKiorQs2dPAICfn1+d1m3lOK2srLBixQq0aNECCoUCnTt3RlZWFgAgOTkZmzZtgo+PDyIjI6FWq00S44MHD/R+z+ZWl9rWr18PlUqFDh06ADBdXeo7BqWnpxt022TiMLBOnTpJX0J6ejoSEhIwcOBA9O3bF2vXrsWuXbtw6tQpxMXFmSzG/Px89OvXDxs2bMAXX3yBHTt2ICsrC/b29tI8Dg4OyMnJMVmM2hITE1FUVAQvLy/cunXLbOpyzZo16NOnj/T6xo0beuuw8nR7e/s6rdvKcTo5OWHAgAEAgNzcXMTGxmL48OEoLCxE165dsWTJEsTHxyM/Px8bN240SYzVfc/mVpcV0tPTcfLkSQQGBgKASetS3zFIoVAYdNtk4jCSixcvIjg4GKGhoXB1dcWGDRvg4OAAGxsbBAQE4OjRoyaLzd3dHevXr0fLli1hZ2eHSZMmITo6+qGDTJrKjh07MGPGDACAs7OzWdWltuoG6qzNAJ6mkJOTg6CgIEycOBEvv/wymjdvjs8++wwdO3aEUqlEcHCwyeq2uu/ZXOty586d8Pf3h5WVFQCYRV1qH4OcnZ0Num0ycRhBUlISpk+fjkWLFsHX1xepqan473//K70vhKjTzrzKTp06hRMnTujE4+TkhJs3b0rT9A0yaQrFxcX4448/MGzYMAAwu7rU5ujoqLcOK0+/deuWyev20qVLUKlU8PX1xdy5cwEAWVlZOq03U9Ztdd+zOdYlUD40+ZgxY6TXpq7LyscgQ2+bTBwGdv36dcydOxdRUVEYO3YsgPKNZu3atcjLy0NJSQl27txp0iuq7t27h/Xr10OtVqOgoADx8fFYuHAhLl++jCtXrqCsrAz79u0zi0EmU1NT0aFDBzRr1gyA+dWlth49euitQycnJ1hbWyMpKQkAsHfvXpPWbUFBAV577TUsWLAAwcHB0vSmTZvi/fffR0ZGBoQQiI2NNVndVvc9m1tdAuWn+4qKiuDs/P8/HMqUdanvGGTobdM8fqo1IJs3b4Zarca6deukaSqVCjNnzsQrr7yC0tJSjBw5Et7e3iaLcejQoThz5gwmTJgAjUYDf39/uLu7Y926dZg/fz7UajUGDx6M0aNHmyzGChkZGTrPC+jSpYtZ1aU2a2vrauswKioKERERKCgogJubm3Qu3BTi4uJw69YtbNmyBVu2bAEADBs2DAsWLEBkZCRmz56NkpIS9OrVSzpFWNdq+p7NqS6B8svEKz/Tws7OzmR1Wd0xyJDbJoccISIiWXiqioiIZGHiICIiWZg4iIhIFiYOIiKShYmDiIhkYeIgMqJhw4bh3Llzpg6DyKCYOIiISBbeAEhUS7///juioqLwzDPPIC0tDU2bNsW6devw2WefoVOnTnjttdcAAGFhYTqvgfJB78LDw3HlyhVYWFjAzc0NkZGRsLCwkJ6NYWFhgaeeegrLly+Hi4uLqVaT6KHY4iCSITk5GQEBAfj+++/h5+eHJUuW1Gq5w4cPo7CwEHv37pXGMMrIyMCJEyfw+eef48svv8R3330Hb29vzJ07F7wvl8wZEweRDF26dJGG1Z44cSLOnz+Pu3fvPnS53r174+
+//0ZAQAD+85//ICgoCO3bt8fx48cxZswY2NnZASh/HkJOTg4yMzONuRpEj4WJg0gGS0vLKtOefPJJnRZCSUlJlXmcnZ1x+PBhzJw5EwUFBZgxYwZ++uknvQ+hEkKgtLTUsIETGRATB5EMKSkpSElJAVD+DAZ3d3c8+eSTSE5OBlD+jIuTJ09WWe6rr75CeHg4PDw8sGTJEnh4eOCvv/7CwIEDceDAAeTm5gIA9uzZg1atWqF9+/Z1t1JEMrFznEiGp556Ch999BGuXbsGOzs7rF+/HpaWlli8eDFGjRqFtm3bom/fvlWWmzBhAk6ePIkxY8bAxsYGTz/9NAICAvDEE09g+vTpCAoKgkajgZ2dHTZt2gQLC/6mI/PF0XGJaun333/H6tWrsW/fPlOHQmRS/FlDRESysMVBRESysMVBRESyMHEQEZEsTBxERCQLEwcREcnCxEFERLIwcRARkSz/D8KXIiLY1PYuAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Boxplot and histogram for every feature column (target 'sepse' excluded)\n",
"for i in treino_full.drop('sepse', axis = 1).columns:\n",
"\n",
"    sns.set(style=\"ticks\")\n",
"\n",
"    x = treino[i]  # values taken from treino, which still contains NaN\n",
"    coluna = i\n",
"    mu = round(x.mean(),2) # mean of distribution\n",
"    sigma = round(x.std(),2) # standard deviation of distribution\n",
"\n",
"    f, (ax_box, ax_hist) = plt.subplots(2)\n",
"\n",
"    sns.boxplot(x=x, ax=ax_box)\n",
"    sns.histplot(x=x, ax=ax_hist)\n",
"\n",
"    ax_box.set(yticks=[])\n",
"    sns.despine(ax=ax_hist)\n",
"    sns.despine(ax=ax_box, left=True)\n",
"    # Backslashes are doubled so that \\mu and \\sigma reach matplotlib's mathtext intact\n",
"    ax_box.set_title('Boxplot e Histograma de {}\\n $\\\\mu={}$, $\\\\sigma={}$'.format(coluna, mu, sigma))\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sCMYgs7wpOuM"
},
"source": [
"Among the values flagged as outliers for pulso, pa_min and pa_max we observed both positive and negative sepsis cases, and this holds when looking at the data before the NaN imputation. For that reason we decided not to mark any additional values as NaN. This remains a point of attention that should be handled at data collection, because in these cases we cannot tell whether the recorded value really occurred or was some kind of error."
]
},
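{
"cell_type": "markdown",
"metadata": {},
"source": [
"The boxplots above flag points beyond 1.5 times the interquartile range. As a minimal sketch of the check described in the previous cell, the code below counts how many flagged rows fall in each class. It runs on synthetic data: the `demo` frame and its values are assumptions made for illustration, not the competition data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical stand-in for one vital sign and the target (not the real data)\n",
"rng = np.random.default_rng(1)\n",
"demo = pd.DataFrame({'pulso': rng.normal(85, 20, 500).round(),\n",
"                     'sepse': rng.integers(0, 2, 500)})\n",
"\n",
"# 1.5 * IQR fences, the same rule seaborn uses by default for boxplot whiskers\n",
"q1, q3 = demo['pulso'].quantile([0.25, 0.75])\n",
"iqr = q3 - q1\n",
"is_outlier = (demo['pulso'] < q1 - 1.5 * iqr) | (demo['pulso'] > q3 + 1.5 * iqr)\n",
"\n",
"# Class counts among the flagged rows\n",
"print(demo.loc[is_outlier, 'sepse'].value_counts())"
]
},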
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"id": "x4wmehOspvdI",
"outputId": "dbdeaf91-4575-4051-b0be-f74565457420"
},
"outputs": [
{
"data": {
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"1287 36.0 135.0 NaN NaN NaN 0\n",
"1292 38.0 130.0 NaN 126.0 98.0 0\n",
"1303 36.0 144.0 NaN NaN NaN 0\n",
"1320 NaN 131.0 NaN NaN NaN 0\n",
"1358 35.0 133.0 NaN NaN NaN 0\n",
"1480 39.0 140.0 NaN NaN NaN 0\n",
"1500 35.0 154.0 NaN 200.0 100.0 0\n",
"1644 36.0 130.0 NaN 123.0 77.0 0\n",
"1679 36.0 138.0 NaN NaN NaN 0\n",
"1695 36.0 152.0 NaN 202.0 119.0 0"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pulso >= 130 and sepse == 0').head(10)  # 185 rows with sepse == 1 and 209 with sepse == 0"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"id": "Q8rxb197qqxG",
"outputId": "88336f15-7b6f-4aa1-87ed-039ac0012f85"
},
"outputs": [
{
"data": {
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"8 35.0 67.0 NaN 153.0 118.0 1\n",
"14 NaN 102.0 NaN 185.0 118.0 1\n",
"28 36.0 104.0 NaN 170.0 112.0 1\n",
"32 35.0 108.0 NaN 147.0 114.0 1\n",
"76 35.0 90.0 NaN 204.0 123.0 1\n",
"149 35.0 74.0 NaN 153.0 114.0 1\n",
"163 39.0 75.0 NaN 178.0 111.0 1\n",
"204 35.0 84.0 NaN 170.0 110.0 1\n",
"276 NaN 107.0 NaN 182.0 114.0 1\n",
"301 37.0 115.0 NaN 186.0 112.0 1"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pa_max >= 110 and sepse == 1').head(10)  # 24 rows with sepse == 1 and 108 with sepse == 0"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"id": "YrNUj3sFsB7G",
"outputId": "0961c651-0ff7-4936-b3ac-183d9903ddc3"
},
"outputs": [
{
"data": {
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"258 35.0 38.0 NaN 155.0 78.0 1\n",
"941 35.0 40.0 NaN NaN NaN 1\n",
"965 35.0 32.0 NaN NaN NaN 1\n",
"8931 35.0 38.0 NaN 155.0 78.0 1"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pulso <= 40 and sepse == 1').head(10)  # 10 rows with sepse == 0 and 4 with sepse == 1"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 545
},
"id": "7wBnvF3xswm3",
"outputId": "7728d8c5-aed7-4628-d5a8-39959337c58a"
},
"outputs": [
{
"data": {
"text/plain": [
" temperatura pulso respiracao pa_min pa_max sepse\n",
"837 NaN 131.0 39.0 82.0 37.0 1\n",
"1054 35.0 56.0 NaN 140.0 11.0 1\n",
"9004 NaN 131.0 39.0 82.0 37.0 1\n",
"9162 NaN 131.0 39.0 82.0 37.0 1\n",
"9205 NaN 131.0 39.0 82.0 37.0 1\n",
"9478 NaN 131.0 39.0 82.0 37.0 1\n",
"9520 NaN 131.0 39.0 82.0 37.0 1\n",
"9547 NaN 131.0 39.0 82.0 37.0 1\n",
"9772 NaN 131.0 39.0 82.0 37.0 1\n",
"9790 NaN 131.0 39.0 82.0 37.0 1\n",
"9947 NaN 131.0 39.0 82.0 37.0 1\n",
"9979 NaN 131.0 39.0 82.0 37.0 1\n",
"10027 NaN 131.0 39.0 82.0 37.0 1\n",
"10064 NaN 131.0 39.0 82.0 37.0 1\n",
"10098 NaN 131.0 39.0 82.0 37.0 1\n",
"10322 NaN 131.0 39.0 82.0 37.0 1"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"treino.query('pa_max < 40 and sepse == 1')  # 16 rows with sepse == 1 and 14 with sepse == 0"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9S4du3u-g6rf"
},
"source": [
"## Train and test"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"id": "Zy7d615vhqlZ"
},
"outputs": [],
"source": [
"X = treino_full.drop('sepse' , axis =1)\n",
"y = treino_full['sepse']\n",
"X = X.to_numpy()\n",
"y = y.to_numpy()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"id": "mMcJL1Aahn8v"
},
"outputs": [],
"source": [
"# Import libraries\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, stratify = y, random_state=42)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"id": "NOqSg8Pniv0-"
},
"outputs": [],
"source": [
"from sklearn.preprocessing import MinMaxScaler\n",
"\n",
"MinMax = MinMaxScaler()\n",
"\n",
"X_train = MinMax.fit_transform(X_train)\n",
"X_test = MinMax.transform(X_test)"
]
},
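{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scaler above is fitted on a single hold-out split. A minimal sketch of how the same scaling could be applied inside each cross-validation fold is shown below, using a scikit-learn `Pipeline` so that the scaler is refitted on the training part of every fold. The data are synthetic and the names `X_demo`, `y_demo`, `pipe` and `cv_demo` are assumptions made for illustration only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.model_selection import StratifiedKFold, cross_validate\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from sklearn.pipeline import make_pipeline\n",
"from sklearn.preprocessing import MinMaxScaler\n",
"\n",
"# Synthetic, imbalanced stand-in for the five vital-sign features (assumption)\n",
"X_demo, y_demo = make_classification(n_samples=1000, n_features=5, n_informative=3,\n",
"                                     n_redundant=0, weights=[0.9, 0.1], random_state=42)\n",
"\n",
"# Scaler and classifier are fitted together on the training part of each fold\n",
"pipe = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=20))\n",
"cv_demo = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)\n",
"metrics = ['balanced_accuracy', 'precision_macro', 'f1_macro']\n",
"scores = cross_validate(pipe, X_demo, y_demo, cv=cv_demo, scoring=metrics)\n",
"\n",
"# Mean of each metric over the ten folds\n",
"for name in metrics:\n",
"    print(name, np.mean(scores['test_' + name]))"
]
},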
{
"cell_type": "markdown",
"metadata": {
"id": "tS4G6uH3rh_o"
},
"source": [
"## Logistic Regression"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "QIl3TsIrpmuJ",
"outputId": "1ede9433-1556-4ea6-f525-e41fbdac7048"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Acuracia média: 0.5253959907920419\n",
"Precisão média: 0.6583524278348761\n",
"F1 médio: 0.5141035935161595\n"
]
}
],
"source": [
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.model_selection import StratifiedKFold\n",
"from sklearn.metrics import balanced_accuracy_score\n",
"from sklearn.metrics import accuracy_score\n",
"from sklearn.metrics import f1_score\n",
"from sklearn.metrics import precision_score\n",
"\n",
"precisao = []\n",
"f1 = []\n",
"acuracia = []\n",
"\n",
"cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)\n",
"\n",
"# Note: the folds below are drawn from the unscaled X and y, and the loop reuses the names\n",
"# y_train / y_test, overwriting the hold-out split created earlier.\n",
"# The list 'acuracia' stores balanced accuracy (balanced_accuracy_score).\n",
"for train_index, test_index in cv.split(X, y):\n",
" x_train, x_test = X[train_index], X[test_index]\n",
" y_train, y_test = y[train_index], y[test_index]\n",
" model = LogisticRegression()\n",
" model.fit(x_train,y_train)\n",
" y_pred = model.predict(x_test)\n",
"\n",
" precisao.append(precision_score(y_test, y_pred, average=\"macro\"))\n",
" f1.append(f1_score(y_test, y_pred, average=\"macro\"))\n",
" acuracia.append(balanced_accuracy_score(y_test, y_pred))\n",
"print('Acuracia média: ', np.mean(acuracia))\n",
"print('Precisão média: ', np.mean(precisao))\n",
"print('F1 médio: ', np.mean(f1))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wcRIYhMJl4EV"
},
"source": [
"## Naive Bayes"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "I8FbiFTRl6pE",
"outputId": "df8a9bbe-a829-454d-97e5-cafa13949415"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Acuracia média: 0.5833666999524743\n",
"Precisão média: 0.6585220325685529\n",
"F1 médio: 0.5997136272100843\n"
]
}
],
"source": [
"from sklearn.naive_bayes import GaussianNB \n",
"\n",
"precisao = []\n",
"f1 = []\n",
"acuracia = []\n",
"\n",
"cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)\n",
"\n",
"for train_index, test_index in cv.split(X, y):\n",
" x_train, x_test = X[train_index], X[test_index]\n",
" y_train, y_test = y[train_index], y[test_index]\n",
" model = GaussianNB()\n",
" model.fit(x_train,y_train)\n",
" y_pred = model.predict(x_test)\n",
"\n",
" precisao.append(precision_score(y_test, y_pred, average=\"macro\"))\n",
" f1.append(f1_score(y_test, y_pred, average=\"macro\"))\n",
" acuracia.append(balanced_accuracy_score(y_test, y_pred))\n",
"print('Acuracia média: ', np.mean(acuracia))\n",
"print('Precisão média: ', np.mean(precisao))\n",
"print('F1 médio: ', np.mean(f1))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zUYcEQECmgjr"
},
"source": [
"## Random forest"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "YNRbbSeYmiVq",
"outputId": "6d6e5745-606a-440d-99b8-c1ae27e547ca"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Acuracia média: 0.6619836318079815\n",
"Precisão média: 0.7124411235618712\n",
"F1 médio: 0.6808835483339327\n"
]
}
],
"source": [
"from sklearn.ensemble import RandomForestClassifier\n",
"\n",
"precisao = []\n",
"f1 = []\n",
"acuracia = []\n",
"\n",
"cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)\n",
"\n",
"for train_index, test_index in cv.split(X, y):\n",
" x_train, x_test = X[train_index], X[test_index]\n",
" y_train, y_test = y[train_index], y[test_index]\n",
" model = RandomForestClassifier()\n",
" model.fit(x_train,y_train)\n",
" y_pred = model.predict(x_test)\n",
"\n",
" precisao.append(precision_score(y_test, y_pred, average=\"macro\"))\n",
" f1.append(f1_score(y_test, y_pred, average=\"macro\"))\n",
" acuracia.append(balanced_accuracy_score(y_test, y_pred))\n",
"print('Acuracia média: ', np.mean(acuracia))\n",
"print('Precisão média: ', np.mean(precisao))\n",
"print('F1 médio: ', np.mean(f1))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zLDQtjXPnVN7"
},
"source": [
"## KNN"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "uH2XEKcWnWfl",
"outputId": "3b89f55a-714a-4187-f5c8-0be2710ee074"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Acuracia média: 0.6175641714353243\n",
"Precisão média: 0.7199765038567818\n",
"F1 médio: 0.6417681902289986\n"
]
}
],
"source": [
"from sklearn.neighbors import KNeighborsClassifier\n",
"\n",
"precisao = []\n",
"f1 = []\n",
"acuracia = []\n",
"\n",
"cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)\n",
"\n",
"# Note: X is not scaled here, so neighbor distances can be dominated by the features with larger ranges.\n",
"for train_index, test_index in cv.split(X, y):\n",
" x_train, x_test = X[train_index], X[test_index]\n",
" y_train, y_test = y[train_index], y[test_index]\n",
" model = KNeighborsClassifier(n_neighbors= 20)\n",
" model.fit(x_train,y_train)\n",
" y_pred = model.predict(x_test)\n",
"\n",
" precisao.append(precision_score(y_test, y_pred, average=\"macro\"))\n",
" f1.append(f1_score(y_test, y_pred, average=\"macro\"))\n",
" acuracia.append(balanced_accuracy_score(y_test, y_pred))\n",
"print('Acuracia média: ', np.mean(acuracia))\n",
"print('Precisão média: ', np.mean(precisao))\n",
"print('F1 médio: ', np.mean(f1))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Kv_yDFnXtwJ6"
},
"source": [
"Filling the NAs in the test set with KNNImputer"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"id": "6ZZ-760yFnCO"
},
"outputs": [],
"source": [
"# KNN imputation on the test set\n",
"from numpy import isnan\n",
"from sklearn.impute import KNNImputer\n",
"\n",
"imputer = KNNImputer()\n",
"# fit on the dataset\n",
"imputer.fit(teste)\n",
"# transform the dataset\n",
"Xtransteste = imputer.transform(teste)\n",
"\n",
"# Test set after imputation\n",
"teste = pd.DataFrame(Xtransteste, columns = teste.columns)"
]
},
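{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above fits the imputer on the test set itself. An alternative worth considering is to fit the imputer on the training features and only transform the test set, so that both sets are filled using the same neighbors. The sketch below illustrates this on synthetic data: `make_frame`, `train_demo`, `test_demo` and `imputer_demo` are hypothetical names, and the generated values are not the competition data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn.impute import KNNImputer\n",
"\n",
"rng = np.random.default_rng(0)\n",
"cols = ['temperatura', 'pulso', 'respiracao', 'pa_min', 'pa_max']\n",
"\n",
"def make_frame(n):\n",
"    # Hypothetical vital signs with roughly 20% of the entries missing\n",
"    data = rng.normal(loc=[36.5, 85, 18, 75, 120], scale=[1, 15, 4, 10, 15], size=(n, 5))\n",
"    data[rng.random(data.shape) < 0.2] = np.nan\n",
"    return pd.DataFrame(data, columns=cols)\n",
"\n",
"train_demo, test_demo = make_frame(200), make_frame(50)\n",
"\n",
"# Fit on the training features only, then reuse the fitted imputer on the test set\n",
"imputer_demo = KNNImputer(n_neighbors=5)\n",
"imputer_demo.fit(train_demo)\n",
"test_filled = pd.DataFrame(imputer_demo.transform(test_demo), columns=cols)\n",
"\n",
"# Number of missing values left after imputation\n",
"print(test_filled.isna().sum().sum())"
]
},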
{
"cell_type": "markdown",
"metadata": {
"id": "eXQ0ifJut2WD"
},
"source": [
"## Submitting the results to Kaggle using the classification model: Random Forest"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"id": "ZNXPJlWRt9C4"
},
"outputs": [
{
"data": {
"text/plain": [
"0 6718\n",
"1 582\n",
"Name: sepse, dtype: int64"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"RF_model = RandomForestClassifier()\n",
"X = treino_completo.drop('sepse', axis =1)\n",
"y = y_completo\n",
"\n",
"RF_model.fit(X,y)\n",
"\n",
"y_pred = RF_model.predict(teste) \n",
"y_pred = np.array(y_pred, dtype = int)\n",
"prediction = pd.DataFrame()\n",
"prediction['id'] = id\n",
"prediction['sepse'] = y_pred\n",
"\n",
"prediction['sepse'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"id": "ihs-55hov3ZJ"
},
"outputs": [],
"source": [
"prediction.to_csv('RFsimples.csv', index = False)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AJcamxM4xs9g"
},
"source": [
"Our best Kaggle score was 0.91411."
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"id": "vFn2WBRfv-_P"
},
"outputs": [
{
"data": {
"text/plain": [
" id sepse\n",
"0 1 0\n",
"1 2 0\n",
"2 3 0\n",
"3 4 0\n",
"4 5 0\n",
"5 6 0\n",
"6 7 0\n",
"7 8 0\n",
"8 9 0\n",
"9 10 0"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prediction.head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "H6vWdpi_9ae3"
},
"source": [
"## Submitting the results to Kaggle using the classification model: KNN"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"id": "FoZ_snM29eTQ"
},
"outputs": [
{
"data": {
"text/plain": [
"0 6715\n",
"1 585\n",
"Name: sepse, dtype: int64"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"KNN_model = KNeighborsClassifier(n_neighbors= 20)\n",
"X = treino_completo.drop('sepse', axis =1)\n",
"y = y_completo\n",
"\n",
"KNN_model.fit(X,y)\n",
"\n",
"y_pred = KNN_model.predict(teste) \n",
"y_pred = np.array(y_pred, dtype = int)\n",
"prediction = pd.DataFrame()\n",
"prediction['id'] = id\n",
"prediction['sepse'] = y_pred\n",
"\n",
"prediction['sepse'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {
"id": "owDXsUu5-H4x"
},
"outputs": [],
"source": [
"prediction.to_csv('KNN20.csv', index = False)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "Final - Sepse.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
"nbformat_minor": 1
}